INTRODUCTION

Historically,  most  problems  of  statistical  inference  have 
been  formulated  as  those  of  estimation  or  testing  of  hypotheses. 

In many practical situations the experimenter is faced with the
problem of comparing several populations. Suppose we have
$k$ ($k \ge 2$) populations whose qualities are characterized by
real-valued parameters $\theta_1, \ldots, \theta_k$, respectively. The
classical approach to this problem is that of testing the hypothesis of
homogeneity, i.e., $H_0: \theta_1 = \cdots = \theta_k$. But in many
situations, the goal of the experimenter is not just to decide whether
all the parameters are equal or not. One of the more frequently
occurring situations for which this is so arises when the experimenter
wishes to find a subset of $\{\pi_1, \ldots, \pi_k\}$ which, in some
sense, is better than the rest of the given populations. Mosteller
(1948), Paulson (1949) and Bahadur (1950) were among the first research
workers to recognize the inadequacy of the classical tests of
homogeneity and to formulate the problem as multiple decision
problems known as ranking and selection problems.

Two formulations for selection and ranking problems have
been considered in the classical framework. To fix ideas suppose
$\pi_j$ is better than $\pi_i$ if $\theta_j > \theta_i$. Consider the
problem of selecting the 'best' population, i.e., the population
associated with $\max_{1 \le i \le k} \theta_i$. The first formulation
is called the 'indifference zone' approach of Bechhofer (1954); the
experimenter is allowed to select only one population, which is the
best one with a preassigned minimum probability $P^*$ whenever the
unknown parameters lie outside a zone of indifference. Contributions
using this approach in the decision-theoretic framework have been made
by Bahadur and Goodman (1952), Lehmann (1966), Eaton (1967) and Alam
(1973) among others. The second formulation, due to Gupta (1956, 1965),
is known as the 'subset selection' approach, in which the experimenter
selects a subset of random size depending on the observed data such
that it contains the best population with probability at least $P^*$.
Decision-theoretic contributions in this framework have been made by
Studden (1967), Deely and Gupta (1968), Berger (1977), Chernoff and
Yahav (1977), Bickel and Yahav (1977), Goel and Rubin (1977), Hsu
(1977) and Gupta and Hsu (1978). The last five of these papers, in
particular, deal with Bayes selection rules. An up-to-date and
comprehensive bibliography for both these approaches can be found in
Gupta and Panchapakesan (1979). There have also been attempts in the
literature to modify these basic formulations. In one such
modification, Fabian (1962) called $\pi_i$ good if
$\theta_i \ge \max_{1 \le j \le k} \theta_j - \Delta$ and bad otherwise,
where the positive constant $\Delta$ is usually specified by the
experimenter. The goal in this framework is to select the good
populations and screen out the bad populations. Contributions along
these lines have also been made by Fabian (1962), Desu (1970), Santner
(1976), Panchapakesan and Santner (1977) and Broström (1978).
Optimality of some of these rules has been studied by Bjørnstad (1978).

A slightly  different  situation  arises  when,  in  addition  to  the 
k treatment  populations,  a control  population  is  considered  and 
the  goal  is  to  partition  the  k treatment  populations  with  regard 
to $\pi_0$. Following the initial investigation of Paulson (1952),

Dunnett  (1955),  Gupta  and  Sobel  (1958),  Bhattacharyya  (1956,1958), 
Lehmann  (1961),  Tong  (1969),  Randles  and  Hollander  (1971)  and 
Miescke  (1979),  among  others,  have  studied  this  problem  in  several 
different  formulations. 

The  present  thesis  consists  of  investigations  of  multiple 
decision  problems  mentioned  in  the  preceding  two  paragraphs,  and 
also  some  related  topics.  In  Chapter  1,  the  problem  of  selecting 
good  populations  is  studied  from  a decision-theoretic  Bayesian 
point  of  view.  We  consider  a loss  function  which  seems  natural 
for  this  problem.  A theorem  is  proved  to  help  find  the  Bayes  rule 
with respect to a permutationally symmetric prior. Some properties
of the Bayes rule are derived when an iid prior is assumed. Then, to
get insight into the performance of the Bayes rule, the loss is assumed
to be $c_1$ if we select a bad population and $c_2$ if we exclude a good
population.  The  rest  of  Chapter  1 pertains  to  further  simplification 
and  approximation  of  the  Bayes  rules  since  these  are  often  analytically 
and  computationally  intractable.  Some  special  cases,  namely, 
normal  and  gamma  distributions  are  considered  with  specific  prior 
distributions. In particular, it is shown that, for the normal
populations, classical rules of the type proposed by Gupta (1956)
and studied by Desu (1970) in this framework turn out to be close
approximations to the Bayes rules. In this connection, Monte Carlo
studies  are  also  performed  to  see  how  well  the  classical  rules 
proposed  by  Seal  (1955)  and  by  Gupta  (1956)  approximate  the  Bayes 
rules  in  terms  of  overall  risks  wrt  exchangeable  priors.  The 
results  of  this  study  indicate  that  the  rules  of  the  type  proposed 
by  Gupta  perform  almost  as  well  as  the  Bayes  rules  throughout  the 
cases  studied.  Similar  results  have  been  found  by  Chernoff  and 
Yahav  (1977)  and  Gupta  and  Hsu  (1978)  under  different  frameworks. 
Since Bayes rules typically require numerical integration to
implement, they are usually unsuitable for practical use.
Therefore,  it  was  deemed  useful  to  provide  tables  which  give  the 
'best'  classical  rules  with  performances  that  are  sufficiently  close 
to  those  of  the  Bayes  rules.  The  tables  also  provide  the  average 
number of bad populations selected, that of good ones excluded, and
the  proportion  of  times  that  the  Bayes  rules  coincide  with  these 
classical  rules. 

Chapter  2 deals  with  the  problem  of  partitioning  k treatment 
populations  with  regard  to  a control  population.  The  goal  is  to 
partition  the  k treatment  populations  into  'better'  populations, 
'worse'  populations  or  'close'  populations  in  an  optimal  way. 

A loss function similar to the one in Bhattacharyya (1958) is assumed,
and $\Gamma$-minimax rules and minimax rules are derived for the known
control case when the parameters of interest are location parameters.
Normal populations with unknown means and known variances are studied
as a special example. When the parameter of the control population
is unknown, both $\Gamma$-minimax rules and minimax rules (in a certain
class of decision rules) are derived for the location parameter
populations. Normal populations with unknown means and known
variances, and normal populations with known means and unknown
variances are provided as examples. The results regarding the
minimax rules generalize a result of Bhattacharyya (1958) for the
normal populations with a known control population. In the last
section, comparisons are made between the $\Gamma$-minimax rules and the
corresponding Bayes rules wrt independent normal priors for the
case of normal populations with a common known variance. Tables
are provided which shed light on the relative performance of these
rules.

Chapter  3 deals  with  a selection  problem  in  reliability  theory 
and  another  for  the  selection  of  scale  parameters.  The  first  part 
deals  with  the  problem  of  selecting  components  (units)  for 
parallel  and  series  systems  from  k populations  with  exponentially 
distributed  lifelength  times.  Loss  functions  are  assumed  which 
are  inversely  proportional  to  the  expected  lifelength  of  the  system 
corresponding  to  a possible  choice  of  the  units  of  the  system. 

A similar  problem  has  been  studied  by  Brostrom  (1977)  for  the 
1-out-of-2 system when there are only two populations. An optimal
rule is given for the series system, and the Bayes rule wrt the natural
conjugate prior is derived for the 1-out-of-2 system when we have
k populations. Tables to implement the Bayes rule are provided at
the end of the chapter. The second part of Chapter 3 deals with the
investigation of selection procedures based on robust estimators
of measures of dispersion for selecting the populations in terms of
scale parameters. Several selection procedures for this problem have
been proposed by many researchers, most of them being extensions of
k-sample tests. The approach here differs from these in that
estimators of population dispersions are directly employed in
constructing the selection procedures. We study the problem of
selecting the t populations associated with the t smallest scale
parameters under the indifference-zone approach and the problem
of selecting a subset containing the population associated with
the smallest scale parameter. Large sample solutions for both
problems are derived and asymptotic relative efficiencies
(following the definitions of Lehmann (1963) and McDonald (1969))
of the proposed procedures are studied. These turn out to be the same
as those of the corresponding estimators in Bickel and Lehmann
(1976).

CHAPTER 1

SELECTION  OF  GOOD  POPULATIONS 


1.1  Introduction 

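Suppose we have $k$ populations $\pi_1, \ldots, \pi_k$ ($k \ge 2$) from
which we wish to select a subset which contains 'good' populations,
where the quality of the i-th population is characterized by the
unknown parameter $\theta_i$. A population $\pi_i$ is said to be

\[
\begin{array}{ll}
\text{good} & \text{if } \theta_i \ge \theta_{[k]} - \Delta, \\
\text{bad}  & \text{if } \theta_i < \theta_{[k]} - \Delta,
\end{array}
\tag{1.1.1}
\]

where $\theta_{[k]} = \max_{1 \le j \le k} \theta_j$ and $\Delta$ is a
specified positive constant.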

Clearly  we  wish  to  select  a subset  containing  as  many  good  populations 
as  possible  while  excluding  the  bad  ones.  Therefore,  any  reasonable 
loss  function  should  have  two  loss  components,  i.e.,  one  incurred  by 
excluding  the  good  populations  and  another  due  to  including  the  bad 
ones. Selection problems for related goals have been considered
in the literature by Fabian (1962), Desu (1970), Carroll, Gupta and
Huang (1975), Santner (1976), Panchapakesan and Santner (1977),
Broström (1978) and Bjørnstad (1978). In all these papers the
selection rules are chosen to control one component of the loss, and
then the other operating characteristics of these rules are studied.
Recently Miescke (1978) has studied this problem from a Bayesian point
of view. We will treat this problem in the Bayesian framework,
considering the two loss components simultaneously.
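
Definition (1.1.1) can be illustrated with a short Python sketch. The function name and the numerical values are hypothetical and chosen only for illustration; in practice the parameters $\theta_i$ are unknown, so this only shows how the labels are defined, not how they are inferred.

```python
def classify_populations(theta, delta):
    """Label each population 'good' or 'bad' following definition (1.1.1).

    theta: list of parameter values theta_1, ..., theta_k (assumed known here)
    delta: the positive constant Delta
    """
    best = max(theta)  # theta_[k], the largest parameter
    return ["good" if t >= best - delta else "bad" for t in theta]


# Hypothetical parameter values, for illustration only.
labels = classify_populations([0.2, 1.0, 0.95, -0.3], delta=0.1)
```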

In  Section  1.2  some  definitions  and  notations  are  introduced,  and 
a decision-theoretic formulation of the problem is given. We describe
loss functions which seem natural in this framework, and, for these loss
functions, a theorem is proved to find the Bayes rule wrt a
permutationally symmetric prior. Some properties of Bayes rules are also given
when  an  iid  prior  is  assumed. 

In Section 1.3 it is assumed that the loss is $c_1$ if a bad
population is selected and $c_2$ if a good population is excluded; the
Bayes rule wrt an exchangeable prior is derived. For the same loss
function some results about the minimax rules are obtained for k = 2.
In Section 1.4 a simplification of the Bayes rule is given by assuming
a particular form of the posterior distribution, and it is shown that,
in some sense, some rules studied in the past are natural
approximations to the Bayes rule.

Normal  and  gamma  populations  are  studied  as  special  examples 
in  Section  1.5.  For  the  normal  populations  the  rules  of  the  type 
proposed by Gupta (1956) are shown to be asymptotically equivalent
to the Bayes rule (as the sample size approaches infinity), and it is
also shown that these are extended Bayes rules. Further
simplification of the Bayes rule for gamma populations is given.

Section 1.6 consists of Monte Carlo comparisons of the performances
of the Bayes rule, the rules of the type proposed by Gupta
(1956), and those proposed by Seal (1955) when the prior is assumed to
be a permutationally symmetric multivariate normal distribution. The
results  of  the  Monte  Carlo  study  indicate  that  the  rules  of  the  type 
proposed  by  Gupta  perform  almost  as  well  as  the  Bayes  rule  throughout 
the  cases  studied,  while  those  proposed  by  Seal  perform  poorly  in 
most  cases. 

1.2  Formulation  of  the  problem 

Let $X_1, \ldots, X_k$ denote the random variables representing the k
populations $\pi_1, \ldots, \pi_k$, respectively. Let $S_k$ be the
symmetric group of all permutations
$\psi: \{1, 2, \ldots, k\} \to \{1, 2, \ldots, k\}$, and let $\psi_{ij}$
denote the element of $S_k$ which interchanges i and j, leaving all
other elements of $\{1, 2, \ldots, k\}$ fixed. For $x \in R^k$ and
$\psi \in S_k$ define $\psi x$ by $(\psi x)_i = x_{\psi^{-1} i}$.

From  now  on  it  is  assumed  that 

(i)  given $\theta = (\theta_1, \ldots, \theta_k) \in \Theta^k$, the
random variables $X_1, \ldots, X_k$ are independently distributed with
$X_i$ having p.d.f. $f(\cdot, \theta_i)$, and
(ii) $f(x, \theta)$ has the monotone likelihood ratio (MLR) property in
$x$ and $\theta$.

The action space $\mathcal{A}$ consists of all possible non-empty
subsets of $\{1, \ldots, k\}$. The action $a \in \mathcal{A}$ is
interpreted as the action of selecting the populations
$\{\pi_i, i \in a\}$ which are asserted to be the good populations. We
will restrict our attention to loss functions which are invariant under
$S_k$ and monotone in the sense of Eaton (1967), that is, for any
$\psi \in S_k$ and $a \in \mathcal{A}$,

\[
\begin{array}{l}
L(\psi\theta, \psi a) = L(\theta, a) \quad \text{and} \\
L(\theta, a) \le L(\theta, \psi_{ij} a) \quad \text{for } \theta \in \Theta^k,\ \theta_i > \theta_j,\ i \in a \text{ and } j \notin a,
\end{array}
\tag{1.2.1}
\]

where $\psi a$ denotes the image of $a$ under $\psi \in S_k$. Note that
the problem is invariant under $S_k$ in the above framework, and
therefore it seems reasonable to consider a prior $\tau$ invariant
under $S_k$. In the remainder of this chapter we will consider only such
a permutationally symmetric prior.

Since Bayes rules are of main interest, attention can be restricted
to the non-randomized decision rules $\delta: R^k \to \mathcal{A}$. From
this point on, for the sake of simplicity, we will use $\delta$ also for
the action $\delta(x)$ taken by the rule $\delta$ whenever no confusion
arises. Let $\mathcal{D}$ denote the class of all non-randomized
decision rules. Let $r_\tau(\delta, x)$ denote the posterior risk of a
decision rule $\delta \in \mathcal{D}$ given $x$, when the prior is
given by $\tau$. Let $x_{[1]} \le x_{[2]} \le \cdots \le x_{[k]}$ denote
the ordered observations, where ties for a label are broken at random,
and let $\pi_{(i)}$ and $\theta_{(i)}$ denote the $\pi$ and $\theta$
associated with $x_{[i]}$, $i = 1, \ldots, k$. For $j = 1, \ldots, k$,
let $\delta_j$ denote the decision rule which chooses the subset
associated with the j largest observations with probability one.
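
As an illustration of the rules $\delta_j$, the following is a minimal Python sketch. The function name `delta_j` and the use of 0-based indices are choices made here for illustration; ties are broken by index order rather than at random, which is a simplifying assumption.

```python
def delta_j(x, j):
    """Return the (0-based) labels of the j largest observations.

    Ties are broken by index order here, a simplification of the
    random tie-breaking used in the text.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # ascending order of x
    return set(order[len(x) - j:])                     # keep the last j labels


# Hypothetical observations, for illustration only.
subset = delta_j([3.1, 0.4, 2.2, 5.0], j=2)
```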

By partitioning the action space $\mathcal{A}$ into k components
$\mathcal{A}_j$, $j = 1, \ldots, k$, where $\mathcal{A}_j$ consists of
all the subsets of size j, Goel and Rubin (1977) proved the following
result which seems useful in the selection problem.

Lemma 1.2.1. If the prior distribution $\tau$ of $\theta$ is
permutationally symmetric on $\Theta^k$, then the Bayes rule $\delta^*$
for the loss function satisfying (1.2.1) satisfies

\[
r_\tau(\delta^*, x) = \min_{1 \le j \le k} r_\tau(\delta_j, x).
\]

When our goal is to select a subset containing the 'best'
population, i.e., the population associated with $\theta_{[k]}$, most of
the loss functions considered in the literature contain two loss
components; the optimal solution is provided by the rule which selects
all the populations if we were to study the problem wrt one of the two
loss components only, and the optimal solution is to select the
population associated with $x_{[k]}$ if only the other component
is considered. In our formulation the experimenter is willing to
accept all the populations which are reasonably close to the 'best'
while screening out all the bad ones. Therefore, it seems reasonable
that any loss function should contain two components: the first one
depending only on the bad populations selected in the subset and the
second depending only on the good populations which are not selected
in the subset. Such a loss function reflects the loss due to
misclassification of good or bad populations. One such general loss
function can be written as follows: For $\theta \in \Theta^k$ and
$a \in \mathcal{A}$,

\[
L(\theta, a) = \sum_{i \in a} L_B(\theta_i - \theta_{[k]} + \Delta)
             + \sum_{i \notin a} L_G(\theta_i - \theta_{[k]} + \Delta)
\tag{1.2.2}
\]

where $L_B$ ($L_G$) is a non-increasing (non-decreasing) function such
that $L_B(y) = 0$ for $y \ge 0$ and $L_G(y) = 0$ for $y < 0$. By this
loss function we mean that there is no loss for the correct judgment,
and that the loss $L_B(\theta_i - \theta_{[k]} + \Delta)$ incurred by
selecting a bad population $\pi_i$ is non-increasing in $\theta_i$, the
parameter associated with $\pi_i$. Similar arguments hold for the second
component $L_G(\cdot)$ of the loss. It is easy to see that the
loss function of the type (1.2.2) satisfies (1.2.1). With this loss
function, we have the following result.

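Lemma 1.2.2. Assume the prior distribution, $\tau(\theta)$, of $\theta$
is a permutationally symmetric one on $\Theta^k$ and the loss function
is given by (1.2.2). Then the following relation holds.

\[
D_i - D_{i-1} \ge 0, \quad i = 2, \ldots, k-1,
\tag{1.2.3}
\]

where $D_i = r_\tau(\delta_{i+1}, x) - r_\tau(\delta_i, x)$ for
$i = 1, \ldots, k-1$.

Proof. It follows from (1.2.2) that $D_i$ is given by

\[
D_i = E[L_B(\theta_{(k-i)} - \theta_{[k]} + \Delta) \mid x]
    - E[L_G(\theta_{(k-i)} - \theta_{[k]} + \Delta) \mid x]
    = E[\ell(\theta_{(k-i)} - \theta_{[k]} + \Delta) \mid x],
\]

where $\ell(\cdot) = L_B(\cdot) - L_G(\cdot)$. Therefore,
$D_i - D_{i-1}$ can be written as

\[
\begin{aligned}
D_i - D_{i-1}
 &= n(x) \sum_{j=0}^{2} \int_{\Theta_j}
    [\ell(\theta_{(k-i)} - \theta_{[k]} + \Delta)
     - \ell(\theta_{(k-i+1)} - \theta_{[k]} + \Delta)]
    f(x, \theta)\, d\tau(\theta) \\
 &= n(x) \int_{\Theta_1}
    [\ell(\theta_{(k-i)} - \theta_{[k]} + \Delta)
     - \ell(\theta_{(k-i+1)} - \theta_{[k]} + \Delta)]
    [f(x, \theta) - f(x, \tilde{\theta})]\, d\tau(\theta) \\
 &\ge 0,
\end{aligned}
\]

where $\Theta_0 = \{\theta \in \Theta^k \mid \theta_{(k-i)} = \theta_{(k-i+1)}\}$,
$\Theta_1 = \{\theta \in \Theta^k \mid \theta_{(k-i)} < \theta_{(k-i+1)}\}$,
$\Theta_2 = \{\theta \in \Theta^k \mid \theta_{(k-i)} > \theta_{(k-i+1)}\}$,
$f(x, \theta) = \prod_{i=1}^{k} f(x_i, \theta_i)$, $n(x)$ is a
normalizing factor and $\tilde{\theta}$ is obtained from $\theta$ by
interchanging the components $\theta_{(k-i)}$ and $\theta_{(k-i+1)}$.
The second equality follows from the symmetry of $\tau$ and the last
inequality follows from the MLR property of $f(x, \theta)$ and the
monotonicity of $\ell(\cdot)$.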

From  Lemma  1.2.1  and  Lemma  1.2.2,  we  derive  a Bayes  rule  for  a 
loss  function  given  by  (1.2.2). 


Theorem 1.2.1. Assume the loss function is given by (1.2.2). Then
the Bayes rule $\delta^*$ wrt a permutationally symmetric prior $\tau$
is given by

\[
\delta^* = \delta_{i^*} \quad \text{for } i^* = \min\{i : D_i \ge 0,\ i = 1, \ldots, k-1\},
\]

where $\min \emptyset$ is defined to be k.
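
The stopping rule in Theorem 1.2.1 can be sketched in Python as follows, assuming the posterior risk differences $D_1, \ldots, D_{k-1}$ have already been computed by some other means. The function name is hypothetical and the input values are illustrative only.

```python
def bayes_subset_size(D):
    """Return i* = min{i : D_i >= 0, i = 1, ..., k-1}, or k if no such i exists.

    D: list with D[i-1] holding D_i for i = 1, ..., k-1. By Lemma 1.2.2 the
    D_i are non-decreasing in i, so the first non-negative entry marks the
    point where enlarging the subset stops reducing the posterior risk.
    """
    for i, d in enumerate(D, start=1):
        if d >= 0:
            return i
    return len(D) + 1  # min of the empty set is defined to be k


# Hypothetical risk differences for k = 4 populations.
size = bayes_subset_size([-0.3, -0.1, 0.2])
```

The Bayes rule then selects the populations associated with the `size` largest observations.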

Before  we  state  some  properties  of  the  Bayes  rule,  we  recall  the 
following definitions (see Nagel (1970) and Santner (1975)).


Definition 1.2.1. Let $\delta(x, i)$ denote the individual probability
of including the population $\pi_i$ in the selected subset $S(x)$. Then
a selection rule $\delta$ is called just if and only if
$\delta(x, i) \le \delta(x', i)$ whenever $x_i \le x'_i$ and
$x_j \ge x'_j$ for $j \ne i$.

Definition 1.2.2. A selection rule $\delta$ is said to be strongly
monotone if and only if, for any $i = 1, \ldots, k$,
$E_\theta[\delta(X, i)]$ is non-decreasing in $\theta_i$ when all other
components of $\theta$ are fixed, and is non-increasing in $\theta_j$
($j \ne i$) when all other components of $\theta$ are fixed.

Corollary 1.2.1. Assume that $\theta_1, \ldots, \theta_k$ are, a priori,
independently identically distributed and the loss function is given by
(1.2.2). Then the Bayes rule $\delta^*$ in Theorem 1.2.1 is just and
strongly monotone.


Proof. It follows from Theorem 1.2.1 that the Bayes rule $\delta^*$
selects $\pi_i$ if and only if $x_i = x_{[k]}$ and/or
$\int \ell(y)\, dQ_i(y \mid x) < 0$, where

\[
Q_i(y \mid x) =
\begin{cases}
1 - \int \prod_{j \ne i} G(z + \Delta - y \mid x_j)\, dG(z \mid x_i) & \text{if } y < \Delta, \\
1 & \text{otherwise,}
\end{cases}
\]

and $G(\cdot \mid x_i)$ is the posterior cdf of $\theta_i$, given $x$.
Therefore, to show that $\delta^*$ is just, it suffices to show that
$Q_i(y \mid x)$ is stochastically smaller than $Q_i(y \mid x')$ whenever
$x_i \le x'_i$ and $x_j \ge x'_j$ for all $j \ne i$, since $\ell(\cdot)$
is non-increasing. This follows from the fact that
$G(\cdot \mid x) \ge G(\cdot \mid x')$ if $x \le x'$. Similarly, we can
show that $\delta^*$ is strongly monotone using a theorem in Barr and
Rizvi (1966).

Remark 1.2.1. If a selection rule $\delta$ is strongly monotone, then
$\delta$ is monotone (see Santner (1975)), i.e.
$E_\theta[\delta(X, i)] \ge E_\theta[\delta(X, j)]$ if
$\theta_i \ge \theta_j$.


1.3  Loss function $c_1 \sum_{i \in a} I_{(-\infty,0)}(\theta_i - \theta_{[k]} + \Delta) + c_2 \sum_{i \notin a} I_{[0,\infty)}(\theta_i - \theta_{[k]} + \Delta)$

Even though Theorem 1.2.1 describes a Bayes rule for the loss
function of the type (1.2.2), we need a more specific loss function
as well as more assumptions about the prior distribution of $\theta$ to
specify the Bayes rule more explicitly. Examples of the loss functions
satisfying (1.2.2) might be given as follows.

\[
L_B(\theta_i - \theta_{[k]} + \Delta) = c_1 I_{(-\infty,0)}(\theta_i - \theta_{[k]} + \Delta), \quad
L_G(\theta_i - \theta_{[k]} + \Delta) = c_2 I_{[0,\infty)}(\theta_i - \theta_{[k]} + \Delta).
\tag{1.3.1}
\]

\[
L_B(\theta_i - \theta_{[k]} + \Delta) = (\theta_i - \theta_{[k]} + \Delta)^-, \quad
L_G(\theta_i - \theta_{[k]} + \Delta) = (\theta_i - \theta_{[k]} + \Delta)^+,
\tag{1.3.2}
\]

where $I_A(y)$ is the usual indicator function, $y^-$ is the negative
part of $y$, $y^+$ is the positive part of $y$, $c_1 > 0$ and $c_2 > 0$.
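
The loss (1.3.1), summed over the populations as in (1.2.2), simply charges $c_1$ for every bad population selected and $c_2$ for every good population excluded. A minimal Python sketch follows; the function name, the 0-based labels and the numerical values are assumptions made for illustration.

```python
def loss_131(theta, selected, delta, c1, c2):
    """Loss of selecting the subset `selected` (0-based labels) under (1.3.1).

    A population is good when theta_i >= max(theta) - delta and bad otherwise.
    """
    best = max(theta)
    loss = 0.0
    for i, t in enumerate(theta):
        good = t >= best - delta
        if i in selected and not good:
            loss += c1          # a bad population was selected
        elif i not in selected and good:
            loss += c2          # a good population was excluded
    return loss


# Hypothetical example: population 3 is bad but selected, population 2 is
# good but excluded, so the loss is c1 + c2.
value = loss_131([0.2, 1.0, 0.95, -0.3], {1, 3}, delta=0.1, c1=0.4, c2=0.6)
```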

From now on we will, unless otherwise mentioned, consider only the
loss function given by (1.3.1) and assume that, given $W = w$,
$\theta_1, \ldots, \theta_k$ are iid random variables with a density and
the distribution of $W$ is known. Such a prior distribution of $\theta$
will be called exchangeable. Note that we may assume that
$c_1 + c_2 = 1$ without loss of generality, and $c_1$ and $c_2$ can be
interpreted as the relative weights of the losses due to two different
sources. It is easy to see that, given $X = x$ and $W = w$, the
$\theta_i$'s are, a posteriori, independently distributed and the
distribution of $\theta_i$ depends on $x$ only through $x_i$. Let
$G_i(\cdot) = G(\cdot \mid x_{[i]}, w)$ and $H(w \mid x)$ denote the
posterior cdf of $\theta_{(i)}$, given $x$ and $w$, and the conditional
cdf of $W$ given $x$, respectively. The bounds on $D_i$ in the
following lemma seem to be useful to simplify the Bayes rule.


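Lemma 1.3.1. For $i = 1, \ldots, k-1$,
$D_i = c_1 - P(\theta_{(k-i)} \ge \theta_{[k]} - \Delta \mid x)$
satisfies the following relations.

\[
\begin{aligned}
D_i &= c_1 - \iint \prod_{j \ne k-i} G_j(z + \Delta)\, dG_{k-i}(z)\, dH(w \mid x), \\
D_i &\ge c_1 - \iint G_k(z + \Delta)\, dG_{k-i}(z)\, dH(w \mid x) = c_1 - u_1(i), \\
D_i &\ge c_1 - \iint G_{k-i+1}(z + \Delta)\, dG_{k-i}(z)\, dH(w \mid x) = c_1 - u_2(i) \quad \text{and} \\
D_i &\le c_1 - \iint G_{k-i}^{\,k-i-1}(z + \Delta)\, G_k^{\,i}(z + \Delta)\, dG_{k-i}(z)\, dH(w \mid x) = c_1 - v_1(i).
\end{aligned}
\tag{1.3.3}
\]

Proof. The first inequality follows from the fact that
$\theta_{[k]} \ge \theta_{(k)}$, and the next inequalities follow from
the fact that $G_i(\cdot) \ge G_j(\cdot)$ for $i < j$.

Next, we state a theorem giving a simpler version of the Bayes
rule from (1.3.3) and Theorem 1.2.1.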


Theorem 1.3.1. Assume the loss function is given by (1.3.1). Then
the Bayes rule $\delta^*$ wrt an exchangeable prior is given as follows.
Let $u(i) = \min\{u_1(i), u_2(i)\}$ for $i = 1, \ldots, k-1$; then

(i)  $u(1) \le c_1 \Rightarrow \delta^* = \delta_1$;

(ii) let $i_0 = \min\{i : u(i) \le c_1,\ i = 2, \ldots, k-1\}$ and
$j_0 = \max\{j \mid c_1 < v_1(j),\ j = 1, \ldots, i_0 - 1\}$;
then $\delta^* = \delta_{i^*}$ where
$i^* = \min\{m : D_m \ge 0,\ j_0 + 1 \le m \le i_0\}$.

Corollary 1.3.1. For k = 2, the Bayes rule $\delta^*$ is given by

\[
\delta^* =
\begin{cases}
\delta_1 & \text{if } \iint G_2(z + \Delta)\, dG_1(z)\, dH(w \mid x) \le c_1, \\
\delta_2 & \text{otherwise.}
\end{cases}
\]

Remark 1.3.1. When the loss function is given by (1.3.2), it is easy
to see that $D_i$ can be written as

\[
D_i = \iint \Bigl[1 - \prod_{j \ne k-i} G_j(z)\Bigr] G_{k-i}(z)\, dz\, dH(w \mid x) - \Delta,
\]

and results analogous to the above can be obtained in a similar way.
In fact, recently Miescke (1978) has studied this problem with such
a loss function using a different approach.
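
When the integrals in Lemma 1.3.1 are not available in closed form, the quantity $D_i = c_1 - P(\theta_{(k-i)} \ge \theta_{[k]} - \Delta \mid x)$ under the loss (1.3.1) can be approximated from posterior draws. The following Python sketch is one possible way to do this and is not the computational method of this chapter; the function names, the toy posterior (independent normals centred at the ordered observations) and all numerical values are assumptions made for illustration.

```python
import random


def estimate_D(draws, i, delta, c1):
    """Monte Carlo estimate of D_i = c1 - P(theta_(k-i) >= theta_[k] - delta | x).

    draws: list of posterior samples; each sample lists
           (theta_(1), ..., theta_(k)) in the order of x_[1] <= ... <= x_[k].
    """
    k = len(draws[0])
    hits = sum(1 for th in draws if th[k - i - 1] >= max(th) - delta)
    return c1 - hits / len(draws)


def bayes_rule_size(draws, delta, c1):
    """Subset size i* of Theorem 1.2.1, using the estimated D_i."""
    k = len(draws[0])
    for i in range(1, k):
        if estimate_D(draws, i, delta, c1) >= 0:
            return i
    return k


# Toy posterior (an assumption): theta_(m) ~ N(x_[m], 0.5^2), independently.
rng = random.Random(0)
x_sorted = [0.1, 0.9, 1.0]
draws = [[rng.gauss(x, 0.5) for x in x_sorted] for _ in range(20000)]
size = bayes_rule_size(draws, delta=0.3, c1=0.5)
```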

At  this  point  some  comments  about  the  case  for  k = 2 are  in 
order  because  of  the  special  structure  of  the  problem  in  this  case. 

As stated in Corollary 1.3.1, we can completely specify the Bayes rule
in this case. Furthermore, we can specify an essentially complete
class in this case provided we make the further assumption that
$f(x_i, \theta_i) = f(x_i - \theta_i)$ with $f(\cdot)$ being the density
wrt the Lebesgue measure. The loss function given by (1.3.1) can be
written as follows.

$L(\theta, a)$

With this loss function the problem is invariant under the group
of translations as well as under $S_k$. Therefore, it seems reasonable
to consider rules invariant under both. Then the decision rule should
depend on $x$ only through $x_1 - x_2$, and our problem becomes a
monotone multiple decision problem (see Ferguson (1967)) with
$X_1 - X_2$ having pdf, say, $q(\cdot - (\theta_1 - \theta_2))$, given
$\theta = (\theta_1, \theta_2)$. Note that
$q((x_1 - x_2) - (\theta_1 - \theta_2))$ has the MLR property in
$x_1 - x_2$ and $\theta_1 - \theta_2$ (see Ibragimov (1956)). Hence
Theorem 6.1 of Ferguson (1967) in this case leads to the following
result.

Theorem 1.3.2. For k = 2, assume that $X_i$ has pdf $f(x_i - \theta_i)$
(i = 1,2) given $\theta$, wrt the Lebesgue measure. Suppose the loss
function is given by (1.3.1). Then the following rules form an
essentially complete class among the translation and permutation
invariant rules:
Rule $R_d$: Select $\pi_i$ if and only if $x_i \ge x_{[k]} - d$, $d \ge 0$.

Corollary 1.3.2. Under the assumptions of Theorem 1.3.2, a minimax
rule $R_{d_m}$ is given as follows.

  $c_1$-values                            $d_m$-values               minimax risks
  --------------------------------------  -------------------------  ---------------------------------------------
  $c_1 \ge Q(\Delta)$                     $0$                        $1 - Q(\Delta)$
  $Q(\Delta - d_0) \le c_1 < 1/2$         $d_0$                      $c_1 Q(d_0 - \Delta) + (1 - c_1) Q(-d_0 - \Delta)$
  otherwise                               $\Delta - Q^{-1}(c_1)$     $(1 - c_1)[c_1 + Q(-2\Delta + Q^{-1}(c_1))]$

Here, $Q(y) = \int_{-\infty}^{y} q(x)\, dx$ and $d_0$ is the value
determined by $c_1 q(d - \Delta) = c_2 q(d + \Delta)$.

Proof. From Theorem 1.3.2 and the generalized Hunt-Stein theorem
it suffices to find a d-value which minimizes the supremum of the
risk associated with the rules $R_d$ for $d \ge 0$. By invariance under
$S_k$, it is sufficient to consider the case when
$\eta = \theta_1 - \theta_2 \ge 0$. It is easy to see that

\[
E_\theta[L(\theta, R_d)] =
\begin{cases}
c_2[Q(-d - \eta) + Q(-d + \eta)] & \text{if } 0 \le \eta \le \Delta, \\
c_1 Q(d - \eta) + c_2 Q(-d - \eta) & \text{if } \Delta < \eta.
\end{cases}
\]

Furthermore, for $0 \le \eta \le \Delta$,

\[
\frac{\partial}{\partial \eta} E_\theta[L(\theta, R_d)]
 = c_2[-q(-d - \eta) + q(-d + \eta)]
 = c_2\, q(d + \eta)\Bigl[\frac{q(d - \eta)}{q(d + \eta)} - 1\Bigr] \ge 0
\]

by the MLR property of $q$. Therefore,

\[
\sup_\theta E_\theta[L(\theta, R_d)]
 = \max[\, c_2 Q(-d - \Delta) + c_2 Q(-d + \Delta),\ c_1 Q(d - \Delta) + c_2 Q(-d - \Delta) \,].
\]

The first term on the right side is non-increasing in $d$, and the
derivative of the second term wrt $d$ is
$c_1 q(d - \Delta) - c_2 q(-d - \Delta)$, which is non-negative for
$d \ge d_0$. Hence the result follows.


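Example 1.3.1.

(A)  Two normal populations with unknown means and a common known
variance:

Suppose $X_1$ and $X_2$ are independent normal random variables with
means $\theta_1$ and $\theta_2$, respectively, and a common known
variance $\gamma^2 > 0$. Then the minimax rule $R_{d_m}$ is given as
follows:

  $c_1$-values                                                  $d_m$-values                                     minimax risks
  ------------------------------------------------------------  -----------------------------------------------  ----------------------------------------------------------------------
  $\Phi(\Delta / (\sqrt{2}\gamma)) \le c_1 < 1$                 $0$                                              $1 - \Phi(\Delta / (\sqrt{2}\gamma))$
  $\Phi((\Delta - d_0) / (\sqrt{2}\gamma)) \le c_1 < 1/2$       $d_0 = \gamma^2 \Delta^{-1} \log(c_1^{-1} - 1)$  $c_1 \Phi((d_0 - \Delta) / (\sqrt{2}\gamma)) + (1 - c_1) \Phi((-d_0 - \Delta) / (\sqrt{2}\gamma))$
  otherwise                                                     $\Delta - \sqrt{2}\gamma\, \Phi^{-1}(c_1)$       $(1 - c_1)[c_1 + \Phi(\Phi^{-1}(c_1) - \sqrt{2}\Delta / \gamma)]$

Here, $\Phi(\cdot)$ denotes the cdf of a standard normal distribution.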
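
The normal case of Corollary 1.3.2 can be checked numerically. The Python sketch below assumes that $X_1 - X_2$ is normal with standard deviation $\sqrt{2}\gamma$, so that $Q(y) = \Phi(y / (\sqrt{2}\gamma))$, and that $c_2 = 1 - c_1$. The function names are hypothetical; `sup_risk` evaluates the maximum risk expression from the proof of Corollary 1.3.2 and `minimax_d` returns the $d_m$-value from the table.

```python
from math import log, sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def sup_risk(d, c1, delta, gamma):
    """Supremum over theta of the risk of rule R_d (two normal populations)."""
    s = sqrt(2.0) * gamma                 # standard deviation of X1 - X2
    c2 = 1.0 - c1
    Q = lambda y: PHI.cdf(y / s)
    both_good = c2 * (Q(-d - delta) + Q(-d + delta))     # eta = Delta
    one_bad = c1 * Q(d - delta) + c2 * Q(-d - delta)     # eta just above Delta
    return max(both_good, one_bad)


def minimax_d(c1, delta, gamma):
    """d_m-value of the minimax rule, following the three rows of the table."""
    s = sqrt(2.0) * gamma
    if c1 >= PHI.cdf(delta / s):
        return 0.0
    if c1 < 0.5:
        d0 = gamma ** 2 / delta * log(1.0 / c1 - 1.0)
        if c1 >= PHI.cdf((delta - d0) / s):
            return d0
    return delta - s * PHI.inv_cdf(c1)


# Hypothetical inputs, for illustration only.
d_m = minimax_d(c1=0.3, delta=0.5, gamma=1.0)
risk = sup_risk(d_m, c1=0.3, delta=0.5, gamma=1.0)
```

Comparing `sup_risk(d_m, ...)` with the same function on a grid of other d-values gives a quick sanity check that the tabulated value does minimize the maximum risk.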

(B)  Two gamma populations with a common shape parameter:

Suppose $X_1$ and $X_2$ are independent gamma random variables with a
common shape parameter $\alpha$ ($\alpha > 0$) and unknown scale
parameters $\beta_1$ and $\beta_2$, respectively. A population $\pi_i$
is then defined to be good if
$\beta_i \ge \Delta^{-1} \max_{1 \le j \le k} \beta_j$ and bad
otherwise, where $\Delta$ ($\Delta > 1$) is a positive constant. By
considering the associated location parameter problem, we can get the
following minimax rule $R_{d_m}$, which selects $\pi_i$ if and only if
$x_i \ge d_m^{-1} x_{[k]}$, $d_m \ge 1$.

  $c_1$-values                          $d_m$-values               minimax risks
  ------------------------------------  -------------------------  ----------------------------------------------------------
  $F(\Delta) \le c_1 < 1$               $1$                        $1 - F(\Delta)$
  $F(\Delta d_0^{-1}) \le c_1 < 1/2$    $d_0$                      $c_1 F(d_0 \Delta^{-1}) + (1 - c_1) F(d_0^{-1} \Delta^{-1})$
  otherwise                             $\Delta / F^{-1}(c_1)$     $(1 - c_1)[c_1 + F(F^{-1}(c_1) / \Delta^2)]$

Here,
$d_0 = ((c_2/c_1)^{1/(2\alpha)} \Delta - 1) / (\Delta - (c_2/c_1)^{1/(2\alpha)})$
and $F$ is the cdf of the F-distribution with degrees of freedom
$2\alpha$ and $2\alpha$.
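
The expression for $d_0$ in the gamma case can be checked against its defining equation $c_1 q(d - \Delta) = c_2 q(d + \Delta)$ on the logarithmic scale. The Python sketch below assumes $c_2 = 1 - c_1$, $c_1 < 1/2$ and $(c_2/c_1)^{1/(2\alpha)} < \Delta$, and uses the fact that, for equal scales, $\log(X_1/X_2)$ has density proportional to $e^{\alpha y} / (1 + e^y)^{2\alpha}$. The function names and numerical values are hypothetical.

```python
from math import exp, log


def gamma_d0(c1, delta, alpha):
    """d_0 for two gamma populations with common shape alpha."""
    r = ((1.0 - c1) / c1) ** (1.0 / (2.0 * alpha))
    return (r * delta - 1.0) / (delta - r)


def log_ratio_density(y, alpha):
    """Unnormalised density of log(X1/X2) when both scale parameters are equal."""
    return exp(alpha * y) / (1.0 + exp(y)) ** (2.0 * alpha)


# Hypothetical inputs, for illustration only.
c1, delta, alpha = 0.3, 3.0, 2.0
d0 = gamma_d0(c1, delta, alpha)

# Both sides of c1 q(d - Delta) = c2 q(d + Delta), with d and Delta
# replaced by log d0 and log Delta (location form of the scale problem).
lhs = c1 * log_ratio_density(log(d0) - log(delta), alpha)
rhs = (1.0 - c1) * log_ratio_density(log(d0) + log(delta), alpha)
```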


1.4  Further  simplification  of  the  Bayes  rule 


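In this section, we make the further assumption that the posterior
cdf of $\theta_{(i)}$ is of the form

\[
G(\cdot \mid x_{[i]}, w) = G\Bigl(\frac{\cdot - b_1 t_i - b_0 w}{b_2}\Bigr),
\]

where $t_i = t(x_{[i]})$ is an increasing function of $x_{[i]}$,
$b_1 > 0$ and $b_2 > 0$. To simplify the forthcoming formulae we
introduce the following notation: for fixed $\Delta^*$ and for
$i = 1, \ldots, k-1$, let

\[
\ell_i(y) = \int G^i(y + z)\, dG(z),
\]
\[
m_i(y \mid \Delta^*) = \int G^{k-i-1}(z + \Delta^*)\, G^i(y + z + \Delta^*)\, dG(z).
\]

It follows from the above specification of the posterior that,
for $i = 1, \ldots, k-1$,

\[
\begin{aligned}
D_i &= c_1 - \int \prod_{j \ne k-i} G\Bigl(z + \frac{b_1}{b_2}(t_{k-i} - t_j) + \frac{\Delta}{b_2}\Bigr)\, dG(z), \\
u_1(i) &= \ell_1\Bigl(\frac{b_1}{b_2}(t_{k-i} - t_k) + \frac{\Delta}{b_2}\Bigr), \\
u_2(i) &= \ell_1\Bigl(\frac{b_1}{b_2}(t_{k-i} - t_{k-i+1}) + \frac{\Delta}{b_2}\Bigr) \quad \text{and} \\
v_1(i) &= m_i\Bigl(\frac{b_1}{b_2}(t_{k-i} - t_k) \Bigm| \frac{\Delta}{b_2}\Bigr).
\end{aligned}
\]

The following well-known results are stated here for completeness;
they provide bounds on $D_i$ (see Hardy, Littlewood and Polya (1934),
Theorem 43 and Theorem 108).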


Lemma 1.4.1. (1) (Tchebycheff) If $Z$ is a random variable, then, for
non-decreasing real-valued functions $f_1$ and $f_2$,
$E[f_1(Z) f_2(Z)] \ge E[f_1(Z)]\, E[f_2(Z)]$ provided the expectations
exist.

(2) (Karamata, Schur) If $\phi$ is a convex function on the real line,
then $T(y) = \sum_{i=1}^{p} \phi(y_i)$ is a Schur-convex function of
$y = (y_1, \ldots, y_p)$.
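Lemma 1.4.2. Under the assumptions made in this section, the following
relations hold.

\[
c_1 - u_3(i) \le D_i \le c_1 - v_2(i), \quad i = 1, \ldots, k-1,
\tag{1.4.2}
\]

where

\[
u_3(i) = \ell_{k-1}\Bigl(\frac{b_1}{b_2}\Bigl(t_{k-i} - \frac{1}{k-1} \sum_{j \ne k-i} t_j\Bigr) + \frac{\Delta}{b_2}\Bigr)
\quad \text{and} \quad
v_2(i) = \prod_{j \ne k-i} \ell_1\Bigl(\frac{b_1}{b_2}(t_{k-i} - t_j) + \frac{\Delta}{b_2}\Bigr).
\]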


Proof. The upper bound on $D_i$ is an immediate consequence of
repeated applications of (1) in Lemma 1.4.1. To obtain the lower
bound $c_1 - u_3(i)$, it suffices to show that $G(\cdot)$ is
log-concave, since the log-concavity of $G$ implies the
Schur-concavity of $\prod_{i=1}^{p} G(y_i)$ in $y = (y_1, \ldots, y_p)$
from (2) in Lemma 1.4.1. Let $g(\theta_i \mid x_i, w)$ denote the
posterior density of $\theta_i$, given $x$ and $w$; then, for $y < y'$
and $x_i < x'_i$,

\[
\begin{aligned}
& G(y' \mid x'_i, w)\, G(y \mid x_i, w) - G(y \mid x'_i, w)\, G(y' \mid x_i, w) \\
&\quad = P(\theta_i \le y \mid x_i, w)\, P(y < \theta_i \le y' \mid x'_i, w)
       - P(\theta_i \le y \mid x'_i, w)\, P(y < \theta_i \le y' \mid x_i, w) \\
&\quad = \iint_{s \le y,\ y < t \le y'}
       [g(s \mid x_i, w)\, g(t \mid x'_i, w) - g(s \mid x'_i, w)\, g(t \mid x_i, w)]\, ds\, dt \\
&\quad \ge 0
\end{aligned}
\]

from the MLR property of $g(\theta_i \mid x_i, w)$ in $\theta_i$ and
$x_i$ for fixed $w$. The last inequality and the specified form of
$G(\cdot \mid x_i, w)$ imply the log-concavity of $G$ (see, for example,
Lehmann (1959) p. 330). Hence
the proof is completed.

The  next  result  is  an  immediate  consequence  of  Lemma  1.4.1  and 
Theorem  1.3.1. 

Theorem 1.4.1. Assume the prior of $\theta$ is exchangeable and satisfies
the specification of the posterior cdf in this section. Then the
Bayes rule $\delta^*$ for the loss function given in (1.3.1) is given as
follows.

(l)  tk_,  < IMXltk  + ^ (ij’lc,)-  ^ tj  + 
(ii)  Let  iQ  = min{i:  tk1.  < i = 2,...,k-l}  and 
jQ  = max{j:  tk_.  > tfc  + ^ (<=i  1^  or 

bi 

i K-j ‘'V'Wi  1 * By1  > °v  J ’ 1 V”  "herc 

ri " ”x(tk  * By  **i'*ci1_  sy*’  tk-i-n  * By  *“(  *ci*_  By1- 

pt  jt^  y ey  uk-i(ci>-  By11’ the" 4*  ’ V for 

i*  = min{i:  Di  >_  0,  jQ  + 1 £ i £ iQ) • 

Note  that  the  above  result  can  be  written  in  terms  of  rules  of 
the  following  types,  which  have  been  studied  in  the  past. 

Rule $\delta^m(d)$: Select $\pi_{(i)}$ iff $t_i = t_{\max}$ and/or $t_i \ge t_{\max} - d$.

Rule $\delta^a(d)$: Select $\pi_{(i)}$ iff $t_i = t_{\max}$ and/or $t_i \ge \frac{1}{k-1}\sum_{j \ne i} t_j - d$.

Rule $\delta^o(\mathbf{d})$: Select $\pi_{(k)}$ and select $\pi_{(k-1)},\dots,\pi_{(k-i^*+1)}$ where
$i^* = \min\{i \mid t_{k-i} < t_{\max} - d(i),\ i = 1,\dots,k-1\}$ and $\mathbf{d} = (d(1),\dots,d(k-1))$.

Here, $t_{\max} = \max_{1 \le j \le k} t_j = t(x_{[k]})$. Rules of the first type were first
proposed by Gupta (1956) for the problem of selecting a subset which
includes the 'best' population, and later studied by Desu (1970) for
selecting a subset consisting of only good populations. Rules $\delta^a$ are
modified versions of the rules proposed by Seal (1955) and studied by
many others for the selection of the 'best' population. The last rule
$\delta^o(\mathbf{d})$ is of the type studied by Brostrom (1978). Let

$$d_1 = \frac{\Delta}{b_1} - \frac{b_2}{b_1}\,\ell_1^{-1}(c_1), \quad d_2 = \frac{\Delta}{b_1} - \frac{b_2}{b_1}\,\ell_{k-1}^{-1}(c_1) \quad\text{and}\quad d(i) = -\frac{b_2}{b_1}\,m_i^{-1}\Big(c_1 \,\Big|\, \frac{\Delta}{b_2}\Big)$$

for $i = 1,\dots,k-1$; then $\delta^m(d_1)$, $\delta^a(d_2)$ and $\delta^o(\mathbf{d})$ are the 'approximate'
Bayes rules suggested by Theorem 1.4.1. Since
$v_1(i) \ge \ell_{k-1}\big(\frac{b_1}{b_2}(t_{k-i} - t_k) + \frac{\Delta}{b_2}\big) = w_1(i)$, $\delta^m(d_2)$ is a special
case of $\delta^o(\mathbf{d})$. Note that $\delta^m(d_1)$ and $\delta^a(d_2)$ ($\delta^m(d_2)$ and $\delta^o(\mathbf{d})$) select
larger (smaller) subsets than the Bayes rule, and that they all
coincide for $k = 2$.
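
As an illustration of the first two rule types, the following is a minimal sketch, not taken from the original text, of $\delta^m(d)$ and $\delta^a(d)$ applied to a vector of statistics $t_1,\dots,t_k$. The function names `select_max_rule` and `select_avg_rule` are hypothetical, and the inequalities are taken as non-strict, which is an assumption about the garbled source.

```python
import numpy as np

def select_max_rule(t, d):
    """delta^m(d): select i iff t_i = t_max and/or t_i >= t_max - d."""
    t = np.asarray(t, dtype=float)
    return (t == t.max()) | (t >= t.max() - d)

def select_avg_rule(t, d):
    """delta^a(d): select i iff t_i = t_max and/or
    t_i >= (mean of the other k-1 statistics) - d."""
    t = np.asarray(t, dtype=float)
    k = t.size
    others_mean = (t.sum() - t) / (k - 1)  # leave-one-out means
    return (t == t.max()) | (t >= others_mean - d)

if __name__ == "__main__":
    t = [0.0, 1.0, 3.0]
    print(select_max_rule(t, 2.5))  # boolean mask of selected populations
    print(select_avg_rule(t, 1.0))
```

Both functions return a boolean mask, so the population with the largest statistic is always selected, even when `d` is negative.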


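Corollary 1.4.1. For $k = 2$, the Bayes rule $\delta^*$ is given by

$$\delta^* = \begin{cases} \delta_1 & \text{if } t_1 \le t_2 + \frac{b_2}{b_1}\,\ell_1^{-1}(c_1) - \frac{\Delta}{b_1}, \\ \delta_2 & \text{otherwise.} \end{cases}$$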


The next result is helpful in eliminating some unnecessary computations
in finding the Bayes rule.


Corollary 1.4.2. If $c_1 \ge \int G^i\big(z + \frac{\Delta}{b_2}\big)\,dG(z)$ for some $i = 1,\dots,k-1$,
then the Bayes rule $\delta^*$ selects at most $i$ populations.


1.5  Some  specific  examples 
(A)  Normal  populations 

Here we assume that $\pi_1,\dots,\pi_k$ are normal populations with unknown
means $\theta_1,\dots,\theta_k$ and a common known variance $\sigma^2$, and that we
have independent samples with a common sample size $n$ for each population.
By sufficiency we can reduce the problem to that based on
sample means $X_1,\dots,X_k$. The loss function is assumed to be given
by (1.3.1). We will consider a permutationally symmetric prior
defined as follows.

Given $W = w$, $(\theta_1,\dots,\theta_k)$ have a multivariate normal distribution
with $E(\theta_i) = w$, $\mathrm{Var}(\theta_i) = \beta^2$ and $\mathrm{Cor}(\theta_i,\theta_j) = \rho$ where
$\rho > -(k-1)^{-1}$ and $W$ has a known distribution $H(\cdot)$.    (1.5.1)

A prior of the above type with $\rho \ge 0$ has been used in Goel and Rubin
(1977) and in Chernoff and Yahav (1977). The following well-known
representation is useful for reducing the above prior to a simpler one.


Lemma 1.5.1. If $Y_1,\dots,Y_k$ are equally correlated normal random
variables with $E(Y_i) = 0$, $\mathrm{Var}(Y_i) = 1$ and $\mathrm{Cor}(Y_i,Y_j) = \rho$, $i,j = 1,2,\dots,k$,
$j \ne i$, where $-(k-1)^{-1} < \rho < 1$, then the $Y_i$'s can be written as


Y.  = /Tp"  Z.  - (/p7  + )Zn  where  (Zn,...,ZJ  have  a multivariate 


normal  distribution  with  E(Z.j)  = 0,  Var(Zi-)  = 1,  i = 0,1,.  ..,k,  and 


C°r(VV  ■ ( 


F / /l -p  if  i = 0 < j £ 


0 


otherwise. 


The next result follows from Lemma 1.5.1, the invariance of the
loss function and the detailed calculation of the posterior distribution.


Lemma 1.5.2. Let $\delta$ be a translation invariant rule, i.e.,
$\delta(x) = \delta(x + b\mathbf{1})$ for any real $b$ where $\mathbf{1} = (1,\dots,1)'$; then the overall
risk $r(\tau,\delta)$ of the rule $\delta$ wrt the prior $\tau$ given in (1.5.1) can be
written as

$$r(\tau,\delta) = \iint L(\theta,\delta(x))\,dN\big(\theta \mid (\sigma_0^{-2}+n\sigma^{-2})^{-1} n\sigma^{-2} x,\ (\sigma_0^{-2}+n\sigma^{-2})^{-1} I\big)\,dN\big(x \mid 0,\ (n^{-1}\sigma^2+\sigma_0^2) I\big) \qquad (1.5.2)$$

where $N(\cdot \mid \mu, \Sigma)$ denotes the cdf of a multivariate normal distribution
with mean $\mu$ and covariance matrix $\Sigma$, $I$ is the $k \times k$ identity matrix and
$\sigma_0^2 = (1-\rho)\beta^2$.

It should be pointed out that a similar reduction has been done
in Chernoff and Yahav (1977) for the case $\rho \ge 0$, and in Gupta and
Hsu (1978). Note that the right side in (1.5.2) is the overall risk
of rule $\delta$ when $\theta_1,\dots,\theta_k$ are, a priori, iid normal random variables
with mean 0 and variance $\sigma_0^2$, and that the Bayes rule wrt the prior
given in (1.5.1) can be taken as translation invariant. Hence we can
reduce the prior given in (1.5.1) to this iid normal prior. Obviously
the posterior cdf of $\theta_{(i)}$ satisfies the specification of the posterior
cdf in Section 1.4 with $b_1 = (\sigma_0^{-2}+n\sigma^{-2})^{-1} n\sigma^{-2}$, $b_2 = (\sigma_0^{-2}+n\sigma^{-2})^{-1/2}$,
$b_0 = 0$, $t_i = x_{[i]}$ and $G(\cdot) = \Phi(\cdot)$. It follows from the preceding
that, for $i = 1,\dots,k-1$,


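$$\ell_i(y) = \int \Phi^i(y+z)\,d\Phi(z),$$

$$m_i(y \mid \Delta^*) = \int \Phi^{k-i-1}(z+\Delta^*)\,\Phi^i(z+y+\Delta^*)\,d\Phi(z) \quad\text{and}$$

$$D_i = c_1 - \int \prod_{j \ne k-i} \Phi\Big(z + n\sigma^{-2}(\sigma_0^{-2}+n\sigma^{-2})^{-1/2}(x_{[k-i]} - x_{[j]}) + (\sigma_0^{-2}+n\sigma^{-2})^{1/2}\Delta\Big)\,d\Phi(z). \qquad (1.5.3)$$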


The Bayes rule $\delta^*$ can be obtained by numerically integrating $D_i$
using Gauss-Hermite or other methods of quadrature (we eliminate
unnecessary computations using Theorem 1.4.1). Note that
$\ell_1(y) = \Phi(y/\sqrt{2})$ and therefore $\delta^*$ selects only one population if
$c_1 \ge \Phi\big((\sigma_0^{-2}+n\sigma^{-2})^{1/2}\Delta/\sqrt{2}\big)$.
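
The quadrature step can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' program: the names `bayes_D` and `bayes_subset_size` are hypothetical, the losses are taken to be normalised so that $c_1 + c_2 = 1$, and the Bayes rule is taken to retain the $i^*$ populations with the largest sample means, where $i^* = \min\{i : D_i \ge 0\}$ (all $k$ if no $D_i$ is non-negative), as Theorem 1.4.1 suggests.

```python
import numpy as np
from scipy.stats import norm

def bayes_D(x, c1, Delta, sigma, n, sigma0=np.inf, nodes=64):
    """Evaluate D_1, ..., D_{k-1} of (1.5.3) by Gauss-Hermite quadrature."""
    xs = np.sort(np.asarray(x, dtype=float))      # x_[1] <= ... <= x_[k]
    k = xs.size
    prior_prec = 0.0 if np.isinf(sigma0) else float(sigma0) ** -2
    prec = prior_prec + n / sigma**2              # sigma0^-2 + n sigma^-2
    b1 = (n / sigma**2) / prec
    b2 = prec ** -0.5
    u, w = np.polynomial.hermite.hermgauss(nodes) # weight exp(-u^2)
    z = np.sqrt(2.0) * u                          # change of variable to N(0,1)
    D = np.empty(k - 1)
    for i in range(1, k):
        idx = k - i - 1                           # zero-based index of x_[k-i]
        others = np.delete(xs, idx)
        shift = (b1 / b2) * (xs[idx] - others) + Delta / b2
        integrand = norm.cdf(z[:, None] + shift[None, :]).prod(axis=1)
        D[i - 1] = c1 - np.sum(w * integrand) / np.sqrt(np.pi)
    return D

def bayes_subset_size(D):
    """Assumed reading of the Bayes rule: keep the i* best populations."""
    nonneg = np.flatnonzero(np.asarray(D) >= 0)
    return int(nonneg[0]) + 1 if nonneg.size else len(D) + 1

if __name__ == "__main__":
    D = bayes_D([0.0, 1.0], c1=0.3, Delta=0.5, sigma=1.0, n=1)
    print(D, bayes_subset_size(D))
```

For $k = 2$ the integral reduces to $\ell_1$, so the output can be checked against the closed form $\Phi(y/\sqrt{2})$ noted above.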

The 'approximate' Bayes rules suggested in Section 1.4 are
determined by $t_i = x_{[i]}$,

$$d_1 = d_1(n,\sigma_0) = \Delta\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big) - \frac{\sigma}{\sqrt{n}}\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big)^{1/2}\sqrt{2}\,\Phi^{-1}(c_1),$$

$$d_2 = d_2(n,\sigma_0) = \Delta\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big) - \frac{\sigma}{\sqrt{n}}\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big)^{1/2}\ell_{k-1}^{-1}(c_1),$$

etc. The Bayes rule wrt the diffuse prior, i.e., $\sigma_0 \to \infty$, is very often
of interest, and it as well as the corresponding 'approximate' rules can
be obtained by formally taking $\sigma_0^{-2} = 0$ or equivalently $\sigma_0^2 = \infty$ in the
above. From these expressions we see that

$$d_1(n,\sigma_0) - d_2(n,\sigma_0) = \frac{\sigma}{\sqrt{n}}\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big)^{1/2}\big(\ell_{k-1}^{-1}(c_1) - \sqrt{2}\,\Phi^{-1}(c_1)\big)$$

would be small, especially when $\sigma_0^{-2} = 0$, for sufficiently large $n$.
Therefore, one might expect that $\delta^m(d_1(n,\sigma_0))$ and $\delta^m(d_2(n,\sigma_0))$
would be close to the Bayes rule for sufficiently large $n$.
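
The constants $d_1$ and $d_2$ require the inverse of $\ell_{k-1}$, which has no closed form for $k > 2$. The sketch below shows one way to compute them numerically; it is an illustration under assumptions, with hypothetical names `ell`, `ell_inv` and `approx_constants`, a Gauss-Hermite evaluation of $\ell_i$, and a root search on $[-12, 12]$ that presumes $c_1$ is not extremely close to 0 or 1.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def ell(i, y, nodes=64):
    """ell_i(y) = integral of Phi(y + z)^i dPhi(z), by Gauss-Hermite."""
    u, w = np.polynomial.hermite.hermgauss(nodes)
    return float(np.sum(w * norm.cdf(y + np.sqrt(2.0) * u) ** i) / np.sqrt(np.pi))

def ell_inv(i, c):
    """Solve ell_i(y) = c for y; ell_i is increasing in y."""
    return brentq(lambda y: ell(i, y) - c, -12.0, 12.0)

def approx_constants(c1, Delta, sigma, n, sigma0, k):
    """d_1(n, sigma0) and d_2(n, sigma0) for the normal model."""
    ratio = 0.0 if np.isinf(sigma0) else sigma**2 / (n * sigma0**2)
    scale = (sigma / np.sqrt(n)) * np.sqrt(1.0 + ratio)
    location = Delta * (1.0 + ratio)
    d1 = location - scale * np.sqrt(2.0) * norm.ppf(c1)
    d2 = location - scale * ell_inv(k - 1, c1)
    return d1, d2

if __name__ == "__main__":
    print(approx_constants(c1=0.3, Delta=0.5, sigma=1.0, n=4, sigma0=2.0, k=3))
```

Since $\ell_{k-1}(y) \le \ell_1(y)$, the computed values satisfy $d_2 \le d_1$, with equality for $k = 2$.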


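Let us consider the rules $\delta^m(h(n,\sigma_0,d))$ for various values of $d$
where

$$h(n,\sigma_0,d) = \Delta\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big) - \frac{\sigma}{\sqrt{n}}\,d\,\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big)^{1/2}.$$

Let $(\pi_{(i)} \in \delta)$ denote an event $\{x : \pi_{(i)} \in \delta(x)\}$ for any rule $\delta$, and
$r(\sigma_0,\delta)$ denote the overall risk of $\delta$ when the prior is given by
(1.5.1). Since $(\pi_{(k-i)} \notin \delta^m(d_1(n,\sigma_0))) \subset (D_i \ge 0)$ and
$\delta^m(d_1(n,\sigma_0)) \subset \delta^m(h(n,\sigma_0,d))$ for $d < \sqrt{2}\,\Phi^{-1}(c_1)$,

$$r(\sigma_0,\delta^m(h(n,\sigma_0,d))) - r(\sigma_0,\delta^m(d_1(n,\sigma_0)))$$

$$= E\Big[\sum_{i=1}^{k-1} D_i\big\{I_{(\pi_{(k-i)} \in \delta^m(h(n,\sigma_0,d)))} - I_{(\pi_{(k-i)} \in \delta^m(d_1(n,\sigma_0)))}\big\}\Big]$$

$$= E\Big[\sum_{i=1}^{k-1} D_i\, I_{(\pi_{(k-i)} \in \delta^m(h(n,\sigma_0,d)),\ \pi_{(k-i)} \notin \delta^m(d_1(n,\sigma_0)))}\Big] \ge 0,$$

where $D_i$ is defined in (1.5.3) and the expectations are taken wrt the
marginal distribution of the $X_i$'s. Similarly, it can be shown that
$r(\sigma_0,\delta^m(h(n,\sigma_0,d))) \ge r(\sigma_0,\delta^m(d_2(n,\sigma_0)))$ for $d > \ell_{k-1}^{-1}(c_1)$. Therefore,
we may consider the rules $\delta^m(h(n,\sigma_0,d))$ only for
$d \in [\sqrt{2}\,\Phi^{-1}(c_1),\ \ell_{k-1}^{-1}(c_1)]$ as long as the overall risk is concerned.
Furthermore, denoting the Bayes rule for sample size $n$ by $\delta^*_n$ and
$\delta^m(h(n,\sigma_0,d))$ by $\delta^m_n$ for fixed $d \in [\sqrt{2}\,\Phi^{-1}(c_1),\ \ell_{k-1}^{-1}(c_1)]$, it is easy to see that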

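$$r(\sigma_0,\delta^m_n) - r(\sigma_0,\delta^*_n)$$

$$= E\Big[\sum_{i=1}^{k-1} D_i\big\{I_{(\pi_{(k-i)} \in \delta^m_n,\ \pi_{(k-i)} \notin \delta^*_n)} - I_{(\pi_{(k-i)} \notin \delta^m_n,\ \pi_{(k-i)} \in \delta^*_n)}\big\}\Big]$$

$$\le E\Big[\sum_{i=1}^{k-1}\big\{(c_1 - \ell_{k-1}(d))\,I_{(\pi_{(k-i)} \in \delta^m_n,\ \pi_{(k-i)} \notin \delta^*_n)} + (\Phi(d/\sqrt{2}) - c_1)\,I_{(\pi_{(k-i)} \notin \delta^m_n,\ \pi_{(k-i)} \in \delta^*_n)}\big\}\Big]$$

$$\le E\Big[\sum_{i=1}^{k-1}(\Phi(d/\sqrt{2}) - \ell_{k-1}(d))\,I_{(\pi_{(k-i)} \notin \delta^m(d_2(n,\sigma_0)),\ \pi_{(k-i)} \in \delta^m(d_1(n,\sigma_0)))}\Big]$$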

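$$= [\Phi(d/\sqrt{2}) - \ell_{k-1}(d)] \sum_{i=1}^{k-1}\Big\{F_i\Big[\frac{\sigma}{\sqrt{n}\,\sigma_0}\,\ell_{k-1}^{-1}(c_1) - \frac{\Delta}{\sigma_0}\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big)^{1/2}\Big] - F_i\Big[\frac{\sigma}{\sqrt{n}\,\sigma_0}\,\sqrt{2}\,\Phi^{-1}(c_1) - \frac{\Delta}{\sigma_0}\Big(1+\frac{\sigma^2}{n\sigma_0^2}\Big)^{1/2}\Big]\Big\}$$

where

$$F_i(y) = \begin{cases} \dfrac{k!}{(i-1)!\,(k-i-1)!} \displaystyle\int_{-\infty}^{y}\Big[\int_{-\infty}^{\infty} \Phi^{i-1}(u+v)\,\varphi(u+v)\,[\Phi(v)-\Phi(u+v)]^{k-i-1}\,\varphi(v)\,dv\Big]du & \text{if } y \le 0, \\ 1 & \text{if } y > 0, \end{cases}$$

and $\varphi(\cdot)$ denotes the pdf of the standard normal distribution.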
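The first inequality follows from the fact that

$$(\pi_{(k-i)} \in \delta^m_n,\ \pi_{(k-i)} \notin \delta^*_n) \subset (D_i \ge 0,\ x_{[k-i]} \ge x_{[k]} - h(n,\sigma_0,d)) \subset (0 \le D_i \le c_1 - \ell_{k-1}(d))$$

and

$$(\pi_{(k-i)} \notin \delta^m_n,\ \pi_{(k-i)} \in \delta^*_n) \subset (c_1 - \ell_1(d) \le D_i \le 0).$$

The second inequality is obvious from the fact that
$\delta^m(d_2(n,\sigma_0)) \subset \delta^*_n \subset \delta^m(d_1(n,\sigma_0))$ and
$\delta^m(d_2(n,\sigma_0)) \subset \delta^m_n \subset \delta^m(d_1(n,\sigma_0))$, and the last
equality follows from the marginal distributions of the $X_i$'s. Because the
two arguments of $F_i$ differ by a term of order $n^{-1/2}$, it follows
from the preceding that $\lim_{n \to \infty} n^{\alpha}[r(\sigma_0,\delta^m_n) - r(\sigma_0,\delta^*_n)] = 0$ for $\alpha \in [0, \frac{1}{2})$.

Hence we have proved the next result.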

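Theorem 1.5.1. Let $r(\sigma_0,\delta)$ denote the overall risk of a rule $\delta$ wrt
the prior distribution given in (1.5.1) where $\sigma_0^2 = \beta^2(1-\rho)$; then

(i) $r(\sigma_0,\delta^m(h(n,\sigma_0,d))) \ge r(\sigma_0,\delta^m(d_1(n,\sigma_0)))$ if $d < \sqrt{2}\,\Phi^{-1}(c_1)$, and
    $r(\sigma_0,\delta^m(h(n,\sigma_0,d))) \ge r(\sigma_0,\delta^m(d_2(n,\sigma_0)))$ if $d > \ell_{k-1}^{-1}(c_1)$,

and

(ii) for $d \in [\sqrt{2}\,\Phi^{-1}(c_1),\ \ell_{k-1}^{-1}(c_1)]$,
    $\lim_{n \to \infty} n^{\alpha}[r(\sigma_0,\delta^m(h(n,\sigma_0,d))) - r(\sigma_0,\delta^*_n)] = 0$ for any $\alpha \in [0, \frac{1}{2})$.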

It is interesting to note that, for $k = 2$, the Bayes rule wrt the
diffuse prior, which selects $\pi_i$ iff $x_i = x_{\max}$ and/or
$x_i \ge x_{\max} - \Delta + \frac{\sigma}{\sqrt{n}}\sqrt{2}\,\Phi^{-1}(c_1)$, coincides with the minimax rule in some
cases (see Example 1.3.1), and with the rule studied by Desu (1970)
if $c_1 = P^*$. Also for $k > 2$, consider the rules $\delta^m(h(n,\infty,d))$ for
various values of $d$. Before we state some properties of these rules
we recall the following definition (see Ferguson (1967)).


Definition 1.5.1. A decision rule $\delta_0$ is called an extended Bayes
rule if, for every $\epsilon > 0$, there exists a prior $\tau$ such that

$$r(\tau,\delta_0) \le \inf_{\delta} r(\tau,\delta) + \epsilon.$$

It can be easily shown that any extended Bayes rule $\delta_0$ is
$\epsilon$-admissible for any $\epsilon > 0$, i.e., there does not exist a rule $\delta$
for which $R(\theta,\delta) < R(\theta,\delta_0) - \epsilon$ for all $\theta \in \Theta$, where
$R(\theta,\delta) = E_{\theta} L(\theta,\delta(X))$. In a manner similar to the one in the proof of
Theorem 1.5.1 we can show that
$\lim_{\sigma_0 \to \infty}[r(\sigma_0,\delta^m(h(n,\infty,d))) - r(\sigma_0,\delta^*_n)] = 0$
for $d \in [\sqrt{2}\,\Phi^{-1}(c_1),\ \ell_{k-1}^{-1}(c_1)]$; therefore, we have the following
result.

Theorem 1.5.2. The decision rules $\delta^m(h(n,\infty,d))$ for
$d \in [\sqrt{2}\,\Phi^{-1}(c_1),\ \ell_{k-1}^{-1}(c_1)]$, which select $\pi_i$ iff $x_i = x_{\max}$ and/or
$x_i \ge x_{\max} - \Delta + \frac{\sigma}{\sqrt{n}}\,d$, are extended Bayes and therefore
$\epsilon$-admissible for any $\epsilon > 0$.

The above arguments indicate that the performance of rules of
the type $\delta^m$ would be close to that of the Bayes rule when $\sigma_0$ is
large, but we could not make similar arguments for the other
'approximate' rules. Hence we carried out a Monte Carlo study to see
how well these rules (studied in the past under a different framework)
perform compared with the Bayes rule. The results of the Monte Carlo
study are given in the next section.

(B) Gamma populations

Here, we consider a problem of selecting good populations out of
$k$ gamma populations in terms of unknown scale parameters based on $nk$
independent observations, assuming a common known shape parameter.
By sufficiency we reduce the data to the $k$ independent gamma random
variables $X_1,\dots,X_k$ with a common known shape parameter $\alpha$ ($\alpha > 0$)
and unknown scale parameters $\beta_1,\dots,\beta_k$, respectively. Population
$\pi_i$ is said to be good if $\beta_i \ge \Delta^{-1}\max_{1 \le j \le k}\beta_j$, and bad otherwise. Here $\Delta$
is a preassigned constant greater than 1.

We also assume that the loss structure is the same as that in
(1.3.1), i.e., the loss is $c_1$ for selecting a bad population and $c_2$
if we exclude a good one. Further it is assumed that $\beta_1,\dots,\beta_k$
are, a priori, independently distributed inverse gamma random
variables, i.e., the prior pdf of $\beta = (\beta_1,\dots,\beta_k)$ is

$$\tau(\beta) = \prod_{i=1}^{k}\Big[\frac{b^a}{\Gamma(a)\,\beta_i^{a+1}}\,e^{-b\beta_i^{-1}}\Big], \quad \beta_i > 0, \text{ for } i = 1,\dots,k \qquad (1.5.4)$$

where $a > 0$ and $b > 0$ are known. Then it is easily seen that,
given $x = (x_1,\dots,x_k)$, $\beta_1,\dots,\beta_k$ are, a posteriori, independent and
have the same distribution as that in (1.5.4) except that $a$ and $b$
are replaced by $a + \alpha$ and $b + x_i$, respectively.

It can be easily shown that, in the associated location
parameter problems, all the assumptions in Section 1.4 are satisfied.
Hence we have the results analogous to those in Section 1.4 with
the following modifications. For $i = 1,\dots,k-1$, let

$$\ell_i(y) = \int_0^{\infty}[1-G(yz)]^i\,dG(z) \quad\text{and}\quad m_i(y \mid \Delta^{-1}) = \int_0^{\infty}[1-G(yz)]^i\,[1-G(\Delta^{-1}z)]^{k-i-1}\,dG(z)$$

where $G(\cdot)$ is the cdf of the gamma distribution with shape parameter
$a + \alpha$ and scale parameter 1. Then, for $i = 1,\dots,k-1$,


$$D_i = c_1 - \int_0^{\infty}\prod_{j \ne k-i}\big[1 - G(t_j\,t_{k-i}^{-1}\,\Delta^{-1} z)\big]\,dG(z),$$

u(m  ■ «,uk 


u2m  ■ 

1 

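$$u_3(i) = \ell_{k-1}\Big(t_{k-i}^{-1}\Big(\prod_{j \ne k-i} t_j\Big)^{\frac{1}{k-1}}\Delta^{-1}\Big),$$

$$v_1(i) = m_i(t_k\,t_{k-i}^{-1}\,\Delta^{-1} \mid \Delta^{-1}) \quad\text{and}$$

$$v_2(i) = \prod_{j \ne k-i}\ell_1(t_j\,t_{k-i}^{-1}\,\Delta^{-1}), \quad\text{where } t_i = x_{[i]} + b. \qquad (1.5.5)$$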


Note that $\ell_1(y) = 1 - F(y)$ where $F(\cdot)$ denotes the cdf of an F random
variable with degrees of freedom $2(a+\alpha)$ and $2(a+\alpha)$. In addition to
the above bounds on $D_i$, we provide another bound in the following,
using the fact that $1-G(\cdot)$ is log-concave (log-convex) if $a + \alpha \ge 1$
($a + \alpha \le 1$, respectively).


>i  { 


(1.5.6) 


Therefore a result analogous to Theorem 1.4.1 can be obtained with
obvious modifications from (1.5.5) and (1.5.6). In this case also, the
Bayes rule can be found by numerically integrating $D_i$ using
Gauss-Laguerre quadrature while we eliminate unnecessary computations
using the result analogous to Theorem 1.4.1. Note that, for $k = 2$,
the Bayes rule wrt the diffuse prior, i.e., $a \to 0$ and $b \to 0$,
coincides with the minimax rule in some cases (see Example 1.3.1).
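
A minimal sketch of the Gauss-Laguerre step for the gamma model follows. It is not the authors' code: the name `gamma_D` is hypothetical, and a generalized Gauss-Laguerre rule (weight $z^{a+\alpha-1}e^{-z}$) is used here so that the gamma density is absorbed into the quadrature weights, which is one possible choice among several.

```python
import numpy as np
from scipy.special import gammaln, roots_genlaguerre
from scipy.stats import f as f_dist
from scipy.stats import gamma

def gamma_D(x, c1, Delta, alpha, a, b, nodes=64):
    """D_i = c1 - int_0^inf prod_{j != k-i} [1 - G(t_j z / (t_{k-i} Delta))] dG(z),
    with G the gamma(a + alpha, 1) cdf and t_i = x_[i] + b."""
    t = np.sort(np.asarray(x, dtype=float)) + b
    k = t.size
    shape = a + alpha
    # Nodes and weights for the weight function z^(shape-1) * exp(-z).
    z, w = roots_genlaguerre(nodes, shape - 1.0)
    w = w / np.exp(gammaln(shape))          # turn weights into gamma probabilities
    D = np.empty(k - 1)
    for i in range(1, k):
        idx = k - i - 1                     # zero-based index of t_{k-i}
        ratios = np.delete(t, idx) / (t[idx] * Delta)
        surv = gamma.sf(z[:, None] * ratios[None, :], shape)
        D[i - 1] = c1 - np.sum(w * surv.prod(axis=1))
    return D

if __name__ == "__main__":
    D = gamma_D([1.0, 3.0], c1=0.3, Delta=1.5, alpha=2.0, a=1.0, b=0.5)
    # For k = 2 this should agree with c1 - [1 - F(y)], F being the F(6, 6) cdf.
    y = 3.5 / (1.5 * 1.5)
    print(D[0], 0.3 - f_dist.sf(y, 6, 6))
```

The $k = 2$ comparison uses the identity $\ell_1(y) = 1 - F(y)$ stated above.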
1.6  Results of the Monte Carlo study for the normal populations.

In this section we are assuming the normal model in Section
1.5 (A). In the preceding sections we have seen that rules studied
in the past help to find the Bayes rule and that they are, in some
sense, natural approximations of the Bayes rule. Among them $\delta^m$ and
$\delta^a$ are perhaps the best-known selection rules. Hence it would
be worthwhile to investigate the performance of these rules in this
Bayesian framework, since they have their own merits and are also
easy to use. Some optimality properties of these rules can be found in the
literature: Gupta and Studden (1966), Berger (1977), Gupta and
Miescke (1978), Bjørnstad (1978), Chernoff and Yahav (1977), Gupta
and Hsu (1978), among others. The last two papers in particular are closely
related to our work in that they studied the performance of these
rules in a Bayesian framework for the problem of selecting a subset
containing the 'best' population.

For our Monte Carlo study we may assume that $\sigma/\sqrt{n} = 1$ without
any loss of generality. We recall that $\delta^m$ and $\delta^a$ can be written in
the following more familiar forms:

$\delta^m(d)$: Select $\pi_i$ if and only if $x_i \ge x_{\max} - d$, $d \ge 0$

$\delta^a(d)$: Select $\pi_i$ if and only if $x_i = x_{\max}$ and/or $x_i \ge \frac{1}{k-1}\sum_{j \ne i} x_j - d$    (1.6.1)

We carried out the Monte Carlo study for the cases $k = 3$ and $k = 9$.
The remaining relevant parameters in this study are $c_1$ (or $c_2$), $\sigma_0^2$ and
$\Delta$. We use $c = c_2/c_1$ for the tabulation purpose since $c$, being the
ratio of two different types of losses, seems more appealing than $c_1$.
The ranges of the parameter values which were studied are as follows.

$\Delta = .25,\ .5,\ 1.0$;  $\sigma_0 = (1.5)^i$ $(i = -2(1)6)$;

$c = 2^{i-1}$ $(i = 1,\dots,5)$ for both $\Delta = .25$ and $\Delta = .5$;

$c = 2^{i-2}$ $(i = 1,\dots,6)$ for $\Delta = 1.0$.

For each of the parameter sets $(c, \sigma_0, \Delta)$, 400 simulations were carried
out for $k = 3$ and 100 simulations were performed for $k = 9$. In
each simulation the random vector $x = (x_1,\dots,x_k)$ was generated
according to its marginal distribution, and then the
Bayes actions and the corresponding risks were obtained by numerically
integrating the $D_i$'s in (1.5.3). The optimal values of $d$ in $\delta^m$ and $\delta^a$ are
estimated by minimizing the average regrets over
sufficiently fine grids of trial values of $d$, where the
range of these trial values is determined from the preliminary
computations and Theorem 1.5.1.
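
The simulation loop described above can be sketched as follows for the rule $\delta^m(d)$. This is an illustration only, not the program used for the tables: the names `D_values` and `average_regret` are hypothetical, the losses are assumed normalised so that $c_1 + c_2 = 1$ (hence $c_1 = 1/(1+c)$), $\sigma/\sqrt{n} = 1$, and the regret of a rule at a given $x$ is taken to be the sum of $D_i$ over the populations it selects minus the corresponding minimum $\sum_i \min(D_i, 0)$ attained by the Bayes rule.

```python
import numpy as np
from scipy.stats import norm

def D_values(xs, c1, Delta, sigma0, nodes=48):
    """D_i of (1.5.3) for sorted xs, with sigma/sqrt(n) = 1."""
    prec = 1.0 + sigma0 ** -2.0
    b1, b2 = 1.0 / prec, prec ** -0.5
    u, w = np.polynomial.hermite.hermgauss(nodes)
    z = np.sqrt(2.0) * u
    k = xs.size
    D = np.empty(k - 1)
    for i in range(1, k):
        idx = k - i - 1
        shift = (b1 / b2) * (xs[idx] - np.delete(xs, idx)) + Delta / b2
        D[i - 1] = c1 - np.sum(w * norm.cdf(z[:, None] + shift).prod(axis=1)) / np.sqrt(np.pi)
    return D

def average_regret(k, c, Delta, sigma0, d_grid, reps=400, seed=0):
    """Average regret of delta^m(d) relative to the Bayes rule, per trial d."""
    rng = np.random.default_rng(seed)
    c1 = 1.0 / (1.0 + c)                      # assumes c = c2/c1 and c1 + c2 = 1
    regret = np.zeros(len(d_grid))
    for _ in range(reps):
        # Marginally the sample means are iid N(0, 1 + sigma0^2).
        xs = np.sort(rng.normal(0.0, np.sqrt(1.0 + sigma0**2), size=k))
        D = D_values(xs, c1, Delta, sigma0)
        gaps = xs[-1] - xs[-2::-1]            # x_[k] - x_[k-i], i = 1..k-1
        bayes_cost = np.minimum(D, 0.0).sum() # Bayes rule keeps exactly the D_i < 0
        for m, d in enumerate(d_grid):
            regret[m] += D[gaps <= d].sum() - bayes_cost
    return regret / reps

if __name__ == "__main__":
    grid = np.linspace(0.0, 3.0, 13)
    r = average_regret(k=3, c=2.0, Delta=0.25, sigma0=1.5, d_grid=grid, reps=100)
    print(grid[np.argmin(r)], r.min())
```

Minimising the returned vector over the grid gives an estimate of the optimal $d$ for one parameter set; the grid range and replication count here are arbitrary choices.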

The estimated Bayes risks and the estimated regrets incurred by
the optimal $\delta^m$ and $\delta^a$ are given in Table I at the end of this chapter
along with sample standard deviations of these estimates. For those
obvious cases when the Bayes rule selects only one population or all
the populations, we did not tabulate and, as a result, these cells
are left blank in the table. Table II gives the average number of
bad populations selected and that of the good ones excluded for the
rules considered, along with proportions of times that the optimal
$\delta^m$ and the optimal $\delta^a$ coincide with the Bayes rule. From Table I,
we can observe that the performance of the rule $\delta^m$ is almost as good
as that of the Bayes rule throughout the cases studied, and that it
becomes remarkably better as the prior variance ($\sigma_0^2 = \beta^2(1-\rho)$) becomes
large. This agrees with the argument in Section 1.5. Also we observe
that, for $k = 3$, rule $\delta^a$ performs reasonably well when the prior is
concentrated, and in fact it performs better than $\delta^m$ in a few extreme
cases when $c$ is very large. However, the performance of $\delta^a$ is poor
for moderately large $\sigma_0$. In particular, for $k = 9$, the performance of $\delta^a$
is disastrous and it was observed that, for most values of $\sigma_0$, the optimal
$\delta^a$ selects only one population for small values of $c$ and it tends to
select much larger subsets than the Bayes rule as soon as $c$ becomes
large (roughly $c > 4$). Overall, the rule $\delta^a$ performs rather poorly
when $k = 9$.

Similar behavior of the rule $\delta^m$ has been observed in Chernoff
and Yahav (1977), and in Gupta and Hsu (1978) for the problem of
selecting a subset containing the 'best' population. On the other hand,
the performance of $\delta^a$ is worse than that observed in Gupta and Hsu (1978),
and it seems that rule $\delta^a$ has little to recommend it for the goal of
selecting 'good' populations, while rule $\delta^m$ performs almost as well
as the Bayes rule provided the value of $d$ is chosen properly. This
indicates that the proper use of rule $\delta^m$ can lead to an efficient
statistical method since it behaves fairly well in various formulations
and also is easy to use and interpret. However, we point out that the
choice of $d$ should depend on the loss structure of the particular
problem at hand, and we suggest the tabulation of the operating
characteristics such as the number of bad populations selected or
that of excluded good ones before setting $P^*$ based on an intuitive
feeling, if one wants to use this rule. For this reason the
estimated optimal $d$-values for rules $\delta^m$ have been provided in Table III.
TABLE I  BAYES RISKS AND REGRETS

THE ENTRY ON TOP OF EACH BOX IS THE ESTIMATED BAYES RISK AND THE NUMBERS
IN THE SECOND AND THIRD ROW ARE THE REGRETS INCURRED BY THE OPTIMAL δ^m
AND THE OPTIMAL δ^a IN THAT ORDER. THE NUMBERS IN THE PARENTHESES ARE
THE SAMPLE STANDARD DEVIATIONS OF THE ESTIMATES.


K = 3, Δ = .25   (a "?" marks a digit that is illegible in the source)

  σ0     row     c = 1            c = 2            c = 4            c = 8            c = 16
  .44    Bayes   .5516 (.0050)    .4612 (.0024)    .2970 (.0006)    .1659 (.0003)
         δ^m     .0046 (.0006)    .0028 (.0005)    .0002 (.0001)    .0000 (.0000)
         δ^a     .0185 (.0013)    .0131 (.0013)    .0002 (.0001)    .0000 (.0000)

  .67    Bayes   .4801 (.0061)    .4337 (.0045)    .3049 (.0020)    .1807 (.0006)    .0971 (.0002)
         δ^m     .0039 (.0006)    .0049 (.0006)    .0016 (.0004)    .0006 (.0002)    .0001 (.0001)
         δ^a     .0023 (.0004)    .0087 (.0010)    .0015 (.0004)    .0006 (.0002)    .0000 (.0000)

 1.00    Bayes   .3657 (.0081)    .3442 (.0066)    .2712 (.0038)    .1728 (.0017)    .0964 (.0008)
         δ^m     .0025 (.0001)    .0041 (.0006)    .0020 (.0004)    .0014 (.0003)    .0005 (.0001)
         δ^a     .0028 (.0006)    .0087 (.0013)    .0027 (.0005)    .0014 (.0003)    .0008 (.0002)

 1.50    Bayes   .2675 (.0083)    .2651 (.0079)    .202? (.0046)    .1340 (.0028)    .0827 (.0014)
         δ^m     .0014 (.0004)    .0031 (.0005)    .0020 (.0004)    .0014 (.0003)    .0009 (.0002)
         δ^a     .0026 (.0006)    .0197 (.0023)    .0080 (.0013)    .0037 (.0007)    .0019 (.0004)

 2.25    Bayes   .2087 (.0090)    .1883 (.0073)    .1414 (.0053)    .0999 (.0032)    .0592 (.0017)
         δ^m     .0009 (.0002)    .0007 (.0002)    .0013 (.0003)    .0010 (.0002)    .0004 (.0001)
         δ^a     .0024 (.0005)    .0250 (.0028)    .0178 (.0022)    .0093 (.0012)    .0050 (.0007)

 3.38    Bayes   .1254 (.0078)    .1280 (.0070)    .0969 (.0048)    .0699 (.0031)    .0360 (.0018)
         δ^m     .0004 (.0002)    .0005 (.0002)    .0003 (.0002)    .0008 (.0002)    .0004 (.0001)
         δ^a     .0011 (.0003)    .0222 (.0028)    .0337 (.0035)    .0199 (.0018)    .0104 (.0010)

 5.06    Bayes   .0805 (.0065)    .0923 (.0062)    .0680 (.0046)    .0498 (.0028)    .0245 (.0016)
         δ^m     .0000 (.0000)    .0005 (.0003)    .0009 (.0003)    .0002 (.0001)    .0000 (.0000)
         δ^a     .0011 (.0004)    .0142 (.0022)    .0370 (.0043)    .0245 (.0021)    .0125 (.0011)

 7.59    Bayes   .0645 (.0064)    .0534 (.0055)    .0447 (.0039)    .0317 (.0024)    .0210 (.0014)
         δ^m     .0000 (.0000)    .0001 (.0001)    .0003 (.0002)    .0000 (.0000)    .0000 (.0000)
         δ^a     .0016 (.0004)    .0104 (.0020)    .0229 (.0035)    .0315 (.0032)    .0162 (.0013)

11.39    Bayes   .0417 (.0050)    .0437 (.0047)    .0239 (.0027)    .0227 (.0021)    .0090 (.0010)
         δ^m     .0000 (.0000)    .0000 (.0000)    .0000 (.0000)    .0000 (.0000)    .0000 (.0000)
         δ^a     .??16 (.00??)    .0110 (.0021)    .0126 (.0027)    .0264 (.0039)    .0185 (.0014)


K = 9, Δ = .25   (a "?" marks a digit that is illegible in the source; only four entries per row survive in the source, shown here under c = 1 to c = 8)

  σ0     row     c = 1            c = 2            c = 4            c = 8            c = 16
  .44    Bayes   1.2362 (.0131)   1.1135 (.0090)   .7255 (.0036)    .4019 (.0010)
         δ^m     .0109 (.0023)    .0182 (.0034)    .0083 (.0013)    .0016 (.0005)
         δ^a     .0287 (.0043)    .2674 (.0099)    .0426 (.0041)    .0047 (.0011)

  .67    Bayes   .9353 (.0157)    .8451 (.0169)    .6359 (.0081)    .3854 (.0038)
         δ^m     .0117 (.0033)    .0203 (.0031)    .0153 (.0023)    .0104 (.0023)
         δ^a     .0201 (.0038)    .1850 (.0085)    .1785 (.0088)    .0459 (.0042)

 1.00    Bayes   .7238 (.0213)    .6007 (.0168)    .4881 (.0125)    .2947 (.0058)
         δ^m     .0124 (.0029)    .0147 (.0028)    .0136 (.0024)    .0085 (.0014)
         δ^a     .0201 (.0038)    .1235 (.0104)    .2617 (.0193)    .1171 (.0063)

 1.50    Bayes   .4943 (.0233)    .4536 (.0181)    .3489 (.0125)    .2221 (.0074)
         δ^m     .0069 (.0025)    .0117 (.0025)    .0091 (.0019)    .0075 (.0018)
         δ^a     .0227 (.0041)    .0627 (.0075)    .0804 (.0105)    .0510 (.0066)

 2.25    Bayes   .3370 (.0199)    .2939 (.0190)    .2095 (.0113)    .1484 (.0075)
         δ^m     .0070 (.0019)    .0095 (.0029)    .0030 (.0011)    .0059 (.0013)
         δ^a     .0222 (.0043)    .0662 (.0101)    .0572 (.0105)    .0419 (.0054)

 3.38    Bayes   .2480 (.0208)    .1814 (.0161)    .1352 (.0103)    .0853 (.0062)
         δ^m     .0030 (.0011)    .0028 (.0010)    .0029 (.0009)    .0020 (.0007)
         δ^a     .0189 (.0043)    .0648 (.0101)    .0708 (.0090)    .0478 (.0081)

 5.06    Bayes   .1416 (.0171)    .1148 (.0120)    .0964 (.0092)    .0700 (.0057)
         δ^m     .0024 (.0012)    .0012 (.0008)    .0019 (.0008)    .0024 (.0007)
         δ^a     .0186 (.0047)    .0527 (.0097)    .0609 (.0102)    .0523 (.0087)

 7.59    Bayes   .0942 (.0131)    .0825 (.0106)    .0529 (.0067)    .0296 (.0034)
         δ^m     .0005 (.0004)    .0011 (.0006)    .0009 (.0006)    .0001 (.0001)
         δ^a     .0098 (.0038)    .0384 (.0084)    .0444 (.0091)    .0340 (.0049)

11.39    Bayes   .0599 (.0111)    .0575 (.0093)    .0393 (.0056)    .0170 (.0027)
         δ^m     .0000 (.0000)    .0000 (.0000)    .0001 (.0001)    .0000 (.0000)
         δ^a     .0151 (.0046)    .037? (.0086)    .0536 (.0081)    .0834 (.00??)

TABLE I (CONTINUED)

K = 3, Δ = .50   (a "?" marks a digit that is illegible in the source)

  σ0     row     c = 1            c = 2            c = 4            c = 8            c = 16
  .44    Bayes   .4609 (.0016)    .3297 (.0012)    .1995 (.0008)
         δ^m     .0024 (.0005)    .0006 (.0002)    .0000 (.0000)
         δ^a     .0334 (.0029)    .0033 (.0008)    .0001 (.0001)

  .67    Bayes   .4686 (.0053)    .3897 (.0026)    .2553 (.0011)    .1456 (.0001)    .0773 (.0003)
         δ^m     .0038 (.0006)    .0029 (.0005)    .0010 (.0003)    .0001 (.0001)    .0000 (.0000)
         δ^a     .048? (.0028)    .0185 (.001?)    .0024 (.0006)    .0000 (.0000)    .0000 (.0000)

 1.00    Bayes   .3872 (.0076)    .3424 (.0056)    .2511 (.0028)    .1532 (.0014)    .0869 (.0006)
         δ^m     .0036 (.0006)    .0027 (.0005)    .0022 (.0004)    .?011 (.0003)    .0006 (.0001)
         δ^a     .0175 (.0018)    .0160 (.0016)    .0053 (.0008)    .0016 (.0004)    .0006 (.0002)

 1.50    Bayes   .2813 (.0088)    .2710 (.0068)    .2005 (.0045)    .1361 (.0024)    .0603 (.0011)
         δ^m     .0027 (.0006)    .0023 (.0005)    .0020 (.0004)    .0009 (.0002)    .0008 (.0002)
         δ^a     .0115 (.0015)    .0180 (.0024)    .0078 (.0013)    .0033 (.000?)    .0014 (.0003)

 2.25    Bayes   .2033 (.0083)    .1833 (.0072)    .1561 (.0049)    .0987 (.0030)    .0590 (.0016)
         δ^m     .0010 (.0005)    .0010 (.0004)    .0011 (.0003)    .0009 (.0003)    .0004 (.0001)
         δ^a     .0103 (.0014)    .0318 (.0034)    .0143 (.0020)    .0082 (.0011)    .0048 (.0006)

 3.38    Bayes   .1470 (.0083)    .1274 (.0065)    .1001 (.0050)    .0662 (.0028)    .0439 (.00??)
         δ^m     .0006 (.0002)    .0004 (.0002)    .0007 (.0003)    .0005 (.0002)    .0004 (.0001)
         δ^a     .0084 (.0013)    .0343 (.0038)    .0313 (.0030)    .0159 (.0016)    .0079 (.0009)

 5.06    Bayes   .0984 (.0072)    .1207 (.0060)    .0726 (.0044)    .0457 (.0025)    .0310 (.0016)
         δ^m     .0000 (.0000)    .0008 (.0004)    .0000 (.0000)    .0002 (.0001)    .0001 (.0000)
         δ^a     .0045 (.0009)    .0241 (.0032)    .0379 (.0039)    .0203 (.0020)    .0122 (.0011)

 7.59    Bayes   .0663 (.0061)    .0564 (.0050)    .0462 (.0033)    .0298 (.0021)    .0179 (.0013)
         δ^m     .0003 (.0001)    .0002 (.0001)    .0002 (.0001)    .0000 (.0000)    .0002 (.0001)
         δ^a     .0045 (.0010)    .0184 (.0030)    .0299 (.0041)    .027? (.0052)    .0150 (.0012)

11.39    Bayes   .0335 (.0043)    .0332 (.0043)    .028? (.0030)    .0200 (.0019)    .0118 (.0011)
         δ^m     .0000 (.0000)    .0000 (.0000)    .0000 (.0000)    .0001 (.0001)    .0000 (.0000)
         δ^a     .0026 (.0013)    .0103 (.0022)    .0243 (.0041)    .0300 (.0038)    .0215 (.00??)


K = 9, Δ = .50   (a "?" marks a digit that is illegible in the source)

  σ0     row     c = 1            c = 2            c = 4            c = 8            c = 16
  .44    Bayes   1.5947 (.0148)   1.5322 (.0119)   1.0429 (.0037)   .5953 (.0016)
         δ^m     .0198 (.0033)    .0296 (.0046)    .0070 (.0018)    .0011 (.0001)
         δ^a     .0?03 (.0052)    .2614 (.0170)    .0306 (.0049)    .0017 (.0006)

  .67    Bayes   1.1473 (.0133)   1.2235 (.0173)   .9849 (.0118)    .6407 (.0091)    .3672 (.0018)
         δ^m     .0083 (.0021)    .0223 (.0041)    .0222 (.0031)    .0112 (.00??)    .0042 (.0012)
         δ^a     .0083 (.0021)    .2021 (.0099)    .3029 (.0146)    .0735 (.0065)    .0110 (.0020)

 1.00    Bayes   .7783 (.0234)    .8272 (.0240)    .7342 (.0179)    .5097 (.0106)    .3219 (.0055)
         δ^m     .0072 (.0028)    .0233 (.0039)    .0191 (.0035)    .0138 (.0026)    .0068 (.0019)
         δ^a     .0108 (.0029)    .1071 (.0092)    .3424 (.0164)    .275? (.0186)    .0931 (.0062)

 1.50    Bayes   .???? (.023?)    .6126 (.0261)    .5207 (.0194)    .3?85 (.01??)    .249? (.0066)
         δ^m     .0064 (.0???)    .0186 (.0041)    .0107 (.0024)    .0101 (.0083)    .0095 (.0016)
         δ^a     .0096 (.0028)    .0654 (.0078)    .2469 (.0148)    .4232 (.0260)    .1834 (.0073)

 2.25    Bayes   .4021 (.0249)    .3444 (.0202)    .3339 (.0???)    .2324 (.0124)    .1500 (.0076)
         δ^m     .0043 (.0018)    .0051 (.0024)    .0062 (.0019)    .00?7 (.0016)    .0041 (.0009)
         δ^a     .0054 (.0016)    .0443 (.0068)    .1223 (.0141)    .1368 (.017?)    .1280 (.0156)

 3.38    Bayes   .2420 (.0203)    .2662 (.0211)    .1925 (.0154)    .1559 (.0120)    .0954 (.0067)
         δ^m     .0015 (.0010)    .0032 (.0010)    .0032 (.0013)    .0019 (.0007)    .0023 (.0007)
         δ^a     .0053 (.0018)    .0424 (.0081)    .0815 (.0130)    .1092 (.0136)    .0771 (.0163)

 5.06    Bayes   .1868 (.0191)    .1284 (.0150)    .1590 (.0152)    .0924 (.0093)    .0723 (.0056)
         δ^m     .0004 (.0003)    .0005 (.0004)    .0032 (.0012)    .0017 (.0008)    .0008 (.0004)
         δ^a     .0128 (.0029)    .0344 (.0071)    .0683 (.0?09)    .0880 (.0134)    .0727 (.0114)

 7.59    Bayes   .1114 (.0153)    .0972 (.0134)    .???? (.0123)    .0583 (.0071)    .0333 (.0036)
         δ^m     .0001 (.0001)    .0012 (.0009)    .0003 (.0005)    .0003 (.0003)    .0000 (.0000)
         δ^a     .0041 (.0021)    .0337 (.0078)    .0852 (.0143)    .0783 (.0121)    .0382 (.0074)

11.39    Bayes   .0622 (.0121)    .0754 (.0115)    .0639 (.0113)    .0421 (.0064)    .0853 (.0035)
         δ^m     .0001 (.0001)    .0004 (.0000)    .0014 (.0007)    .0000 (.0000)    .0002 (.0002)
         δ^a     .0059 (.0026)    .0236 (.0069)    .0441 (.0099)    .0629 (.0?07)    .0491 (.00??)

CHAPTER 2

Γ-MINIMAX RULES FOR PARTITIONING
TREATMENTS WITH RESPECT TO A CONTROL

2.1  Introduction 

In many fields of research one is faced with the problem of
comparing k experimental categories with reference to a 'standard'
or a 'control'. Following the initial investigation by Paulson
(1952), this problem has been studied in several different formulations
by Dunnett (1955), Gupta and Sobel (1958) and Lehmann (1961), among
others. Tong (1969) has studied a problem where the treatment
populations are to be partitioned into two sets, one consisting of
'better' populations and another consisting of 'worse' populations.
Later Randles and Hollander (1971) applied the Γ-minimax principle to the
same problem.

Let $\pi_1,\dots,\pi_k$ denote the k experimental categories or 'treatment'
populations and let $\pi_0$ denote the 'control' population. We assume that
each population is characterized by a real-valued location parameter
$\theta_i$ (i = 0,1,...,k). We consider a problem in which the treatment
populations $\pi_1,\dots,\pi_k$ are to be classified as 'better' than, 'worse'
than or 'close' to the control $\pi_0$ if the corresponding parameter
values are much larger than, much smaller than or sufficiently close
to the value of $\theta_0$. Similar problems have been considered in
Bhattacharyya (1956, 1958) and in Seeger (1972) when $\theta_0,\dots,\theta_k$ are
means of independent normal distributions. We apply the Γ-minimax
principle to this problem. The Γ-minimax principle is known as one of
the techniques for the use of incomplete prior information. It is
assumed that although a prior distribution on the states of nature
is not available, it is known to belong to some family, Γ, of
distributions. The principle then requires the decision maker to select a
decision rule which minimizes the supremum of the overall Bayes risk
over distributions in Γ. Such a rule is called a Γ-minimax rule.
Such a principle was first used by Robbins (1951) and independently
by Hodges and Lehmann (1952) and Menges (1966). The name Γ-minimax
was first used by Blum and Rosenblatt (1967). This principle has
been applied to various problems by Jackson, O'Donovan, Zimmer and
Deely (1970), Solomon (1972a, 1972b), DeRouen and Mitchell (1974),
Gupta and Huang (1975, 1977), Berger (1977) and Miescke (1979).

In Section 2.2 definitions and notations are introduced, and
a formulation of the problem is given. The loss function and the
incomplete prior are introduced. Results analogous to standard
ones on minimaxity are given to help find Γ-minimax decision rules.

In Section 2.3 a Γ-minimax decision rule is derived for the
case in which $\theta_0$ is known and a minimax decision rule is found for
the same case when a specific loss function is assumed. A solution
is provided for the example in which $\theta_0,\dots,\theta_k$ are unknown means
of normal distributions.

In Section 2.4, the case in which the control population
parameter $\theta_0$ is unknown is treated. Rules are derived which are
Γ-minimax among procedures for which the classification of the
i-th population depends only on $X_i$ and $X_0$ where $X_0,\dots,X_k$ are
independent random variables representing populations $\pi_0,\dots,\pi_k$,
respectively. Specific examples are again given. One example is
to classify normal populations by their locations (means) wrt the
mean of a normal control population. Another is to classify normal
populations by their variances.

Section 2.5 consists of comparisons of Γ-minimax rules with
Bayes rules wrt independent normal priors for the case of normal
populations with common known variance.

2.2  Formulation  of  the  problem 

Let $X_0, X_1, \dots, X_k$ be k+1 independent random variables representing
the control $\pi_0$ and the k treatment populations $\pi_1,\dots,\pi_k$, respectively,
with $X_i$ having probability density function $f_i(x-\theta_i)$ with respect to
the Lebesgue measure on the real line R where $\theta_i \in \Theta = R$, i = 0,
1,...,k. The random variables $X_0,\dots,X_k$ may represent sufficient
statistics or other statistics based on which we wish to make
statistical decisions. We assume that each $f_i(\cdot)$ (i = 0,1,...,k) is
symmetric about the origin and strongly unimodal, i.e., $f_i(\cdot)$ is
log-concave on the real line. Hence $f_i(x-\theta_i)$ has the monotone
likelihood ratio (MLR) property. Obviously, we do not need any
observations from $\pi_0$ when $\theta_0$ is assumed known; therefore, it will be
understood that, in such a case, the random variable $X_0$ is deleted from
our consideration.
The action space $\mathcal{A}$ can be written as $\mathcal{A} = \mathcal{A}_1 \times \cdots \times \mathcal{A}_k$ where
$\mathcal{A}_i = \{1,2,3\}$ for i = 1,...,k. The action $a = (a_1,\dots,a_k) \in \mathcal{A}$ is
to be interpreted in such a way that, for i = 1,...,k, the treatment
population $\pi_i$ is classified as 'worse' if $a_i = 1$, 'close' if $a_i = 2$
and 'better' if $a_i = 3$. The loss $L(\theta,a)$ incurred by the action $a \in \mathcal{A}$
for $\theta = (\theta_0,\theta_1,\dots,\theta_k)$ is assumed to be of the following form.

$$L(\theta,a) = \sum_{i=1}^{k} L_i(\theta,a_i) \tag{2.2.1}$$

where $L_i(\theta,a_i)$ is defined below and denotes the loss in each
component problem incurred by the component action $a_i$. For arbitrary
but fixed positive constants $\Delta_1$ and $\Delta_2$ such that $\Delta_1 < \Delta_2$, we define
five disjoint and exhaustive subregions $R_W$, $R_{I_1}$, $R_C$, $R_{I_2}$, $R_B$ of the
real line R by $R_W = (-\infty, -\Delta_2]$, $R_{I_1} = (-\Delta_2,-\Delta_1)$, $R_C = [-\Delta_1,\Delta_1]$,
$R_{I_2} = (\Delta_1,\Delta_2)$ and $R_B = [\Delta_2,\infty)$, and define $L_i(\theta,a_i)$ as in the next
table.


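Table of loss $L_i(\theta, a_i)$

                                          Action $a_i$
State of nature                       1                  2           3
$\theta_i-\theta_0 \in R_W$           0                  $\ell_1$    $\ell_1+\ell_3$
$\theta_i-\theta_0 \in R_{I_1}$       0                  0           $\ell_4$
$\theta_i-\theta_0 \in R_C$           $\ell_2$           0           $\ell_2$
$\theta_i-\theta_0 \in R_{I_2}$       $\ell_4$           0           0
$\theta_i-\theta_0 \in R_B$           $\ell_1+\ell_3$    $\ell_1$    0

($\ell_j > 0$, j = 1,...,4)                                          (2.2.2)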
This loss function reduces to the one considered in Bhattacharyya
(1958) and in Seeger (1972) if $\ell_1 = \ell_2 = \ell_4 = \ell_1+\ell_3 = 1$. Note that
the above loss function assumes the indifference zones in the sense
that we do not distinguish between the actions 1 and 2 (2 and 3) when
$\theta_i-\theta_0 \in R_{I_1}$ ($\theta_i-\theta_0 \in R_{I_2}$, respectively). It should be pointed out
that Bhattacharyya (1958) considered this loss function to avoid
an irregularity of a similar loss function without indifference
zones. In fact, Bhattacharyya (1956) derived an admissible minimax
decision rule assuming a simple 0-1 type loss function when $\theta_0$ is
assumed known and $\theta_1,\dots,\theta_k$ are unknown means of normal distributions.
However, the irregularity of such a loss function has been pointed
out in the sense that the minimax risk does not tend to zero even
if the sample sizes increase indefinitely, and the same problem has
been studied afresh by Bhattacharyya (1958) with a loss function
of the type given in (2.2.2).
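As an illustration only, the component loss of table (2.2.2) can be written as a small lookup. This is a minimal sketch: the function name `component_loss` and its argument names are hypothetical, with `d1`, `d2` standing for the two zone constants and `l1`, ..., `l4` for the four loss constants.

```python
def component_loss(diff, action, d1, d2, l1, l2, l3, l4):
    """Loss for one treatment, read off table (2.2.2).

    diff   : theta_i - theta_0
    action : 1 ('worse'), 2 ('close') or 3 ('better')
    d1, d2 : zone constants with 0 < d1 < d2 (hypothetical names)
    """
    if not 0 < d1 < d2:
        raise ValueError("need 0 < d1 < d2")
    if action not in (1, 2, 3):
        raise ValueError("action must be 1, 2 or 3")
    if diff <= -d2:        # region R_W
        row = (0.0, l1, l1 + l3)
    elif diff < -d1:       # indifference zone between 'worse' and 'close'
        row = (0.0, 0.0, l4)
    elif diff <= d1:       # region R_C
        row = (l2, 0.0, l2)
    elif diff < d2:        # indifference zone between 'close' and 'better'
        row = (l4, 0.0, 0.0)
    else:                  # region R_B
        row = (l1 + l3, l1, 0.0)
    return row[action - 1]
```

In the two indifference zones two of the three actions carry zero loss, which is the sense in which adjacent actions are not distinguished there.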

For given $x = (x_0,x_1,\dots,x_k)$ consider decision rules of the form

$$\delta(x) = (\delta_1(x),\dots,\delta_k(x)) \tag{2.2.3}$$

where $\delta_i(x) = (\delta_i(1|x), \delta_i(2|x), \delta_i(3|x))$ and, for j = 1,2,3, $\delta_i(j|x)$
denotes the conditional probability of taking action j in the i-th
component problem. Note that there is no loss of generality in
considering decision rules of the form given in (2.2.3). The risk
function of a rule $\delta$ for fixed $\theta$ is then $R(\theta,\delta) = \sum_{i=1}^{k} R_i(\theta,\delta_i)$ where
$R_i(\theta,\delta_i) = E_\theta[L_i(\theta,\delta_i(X))]$. For a prior distribution $\tau(\theta)$ of $\theta$, the
overall risk of a rule $\delta$ wrt $\tau$ is denoted by $r(\tau,\delta) = \sum_{i=1}^{k} r_i(\tau,\delta_i)$
where $r_i(\tau,\delta_i) = \int R_i(\theta,\delta_i)\,d\tau(\theta)$.
Suppose that partial prior information is available to a decision
maker such that, for each i, he can specify $\gamma_i = P[\theta_i-\theta_0 \in R_W \cup R_B]$ and
$\gamma_i' = P[\theta_i-\theta_0 \in R_C]$ where $\gamma_i+\gamma_i' \le 1$ for i = 1,...,k. Let Γ denote
the class of all such prior distributions, i.e.,

$$\Gamma = \Big\{\tau(\theta):\ \int_{\theta_i-\theta_0 \in R_W \cup R_B} d\tau(\theta) = \gamma_i,\ \int_{\theta_i-\theta_0 \in R_C} d\tau(\theta) = \gamma_i' \ \text{ for } i = 1,\dots,k\Big\}. \tag{2.2.4}$$

Now our goal is to find a Γ-minimax rule $\delta^*$ as defined below.

Definition 2.2.1. A rule $\delta^*$ is a Γ-minimax decision rule if

$$\sup_{\tau\in\Gamma} r(\tau,\delta^*) = \inf_{\delta}\sup_{\tau\in\Gamma} r(\tau,\delta),$$

and $\sup_{\tau\in\Gamma} r(\tau,\delta^*)$ is called the Γ-minimax value.

Many authors have found Γ-minimax rules by finding Bayes rules
with respect to 'least favorable' priors in the class Γ. However,
such a method was found not to lead to the solution of our problem;
the following results analogous to standard results on minimaxity
are found useful.

Lemma 2.2.1. Suppose $\{\tau_n,\ n = 1,2,\dots\}$ is a sequence of priors in Γ.
If $\lim_n \inf_\delta r(\tau_n,\delta) \ge c$ and if $\sup_{\tau\in\Gamma} r(\tau,\delta^*) \le c$, then $\delta^*$ is a Γ-minimax
decision rule and c is the Γ-minimax value.

Proof. The result is an immediate consequence of the following
inequalities.

$$\sup_{\tau\in\Gamma}\inf_\delta r(\tau,\delta) \ \ge\ \lim_n \inf_\delta r(\tau_n,\delta) \ \ge\ c \ \ge\ \sup_{\tau\in\Gamma} r(\tau,\delta^*) \ \ge\ \inf_\delta\sup_{\tau\in\Gamma} r(\tau,\delta) \ \ge\ \sup_{\tau\in\Gamma}\inf_\delta r(\tau,\delta).$$

The next result is useful when we have certain invariance under a
finite group. Following Ferguson (1967), let $G = \{g_1,\dots,g_N\}$, $\bar G$
and $\tilde G$ denote the group of transformations on the sample space $\mathcal{X}$,
the induced group on the parameter space and that on the action
space, respectively.

Lemma 2.2.2. Suppose that a given decision problem is invariant
under a finite group. If $\bar g\tau \in \Gamma$ for any $\tau \in \Gamma$ and $\bar g \in \bar G$ where
$\bar g\tau(B) = \tau(\bar g^{-1}(B))$, then a Γ-minimax rule within the class of invariant
behavioral decision rules is Γ-minimax.

Proof. Let $\delta^I$ denote the invariant rule generated from a given
rule $\delta$, i.e., $\delta^I(x,A) = \frac{1}{N}\sum_{j=1}^{N} \delta(g_j(x), \tilde g_j(A))$ for each $x \in \mathcal{X}$ and
$A \subset \mathcal{A}$. Then clearly $\delta^I$ is invariant, and

$$\sup_{\tau\in\Gamma} r(\tau,\delta^I) = \sup_{\tau\in\Gamma} \frac{1}{N}\sum_{j=1}^{N} r(\tau,\delta^{g_j}) \quad\text{where } \delta^{g_j}(x,A) = \delta(g_j(x),\tilde g_j(A))$$

$$= \sup_{\tau\in\Gamma} \frac{1}{N}\sum_{j=1}^{N} r(\bar g_j\tau,\delta)$$

$$\le \frac{1}{N}\sum_{j=1}^{N} \sup_{\tau\in\Gamma} r(\bar g_j\tau,\delta)$$

$$\le \frac{1}{N}\sum_{j=1}^{N} \sup_{\tau\in\Gamma} r(\tau,\delta) \quad\text{since } \bar g\tau \in \Gamma \text{ for any } \tau \in \Gamma \text{ and } \bar g \in \bar G$$

$$= \sup_{\tau\in\Gamma} r(\tau,\delta).$$

This completes the proof.

It  should  be  pointed  out  that  our  decision  problem  is  invariant 
under  the  symmetry  about  the  origin,  and  that  randomized  decision  rules 
and  behavioral  rules  are  equivalent  since  the  distributions  of  X's  are 
assumed  to  be  continuous. 

2.3  Known  control 

In this section $\theta_0$ is assumed known and thus we may assume $\theta_0 = 0$
without loss of generality. Hence x and $\theta$ in this section denote
$(x_1,\dots,x_k)$ and $(\theta_1,\dots,\theta_k)$, respectively.

Lemma 2.3.1. Suppose that a decision rule $\delta(x)$ of the form in (2.2.3)
is determined by $\delta_i(1|x) = I_{(-\infty,-d_i)}(x_i)$, $\delta_i(2|x) = I_{[-d_i,d_i]}(x_i)$ and
$\delta_i(3|x) = I_{(d_i,\infty)}(x_i)$ for $d_i \ge 0$ and i = 1,...,k. Then, for i = 1,...,k,

$$\sup_{\tau\in\Gamma} r_i(\tau,\delta_i) \le r_i \tag{2.3.1}$$

where, for i = 1,...,k,

$$r_i = \int_{d_i}^{\infty} \big[\ell_3\gamma_i f_i(x+\Delta_2) + \ell_2\gamma_i'\big(f_i(x-\Delta_1)+f_i(x+\Delta_1)\big) + \ell_4(1-\gamma_i-\gamma_i') f_i(x+\Delta_1)\big]\,dx + \int_{-\infty}^{d_i} \ell_1\gamma_i f_i(x-\Delta_2)\,dx.$$

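Proof. It follows from (2.2.2) that, for $\theta_i \le -\Delta_2$,

$$R_i(\theta,\delta_i) = \ell_1\int_{-d_i}^{\infty} f_i(x-\theta_i)\,dx + \ell_3\int_{d_i}^{\infty} f_i(x-\theta_i)\,dx$$

$$\le \ell_1\int_{-d_i}^{\infty} f_i(x+\Delta_2)\,dx + \ell_3\int_{d_i}^{\infty} f_i(x+\Delta_2)\,dx$$

$$= \ell_1\int_{-\infty}^{d_i} f_i(x-\Delta_2)\,dx + \ell_3\int_{d_i}^{\infty} f_i(x+\Delta_2)\,dx \quad\text{by the symmetry of } f_i(\cdot).$$

In a similar manner it can be shown that

$$R_i(\theta,\delta_i) \le \begin{cases} \ell_1\int_{-\infty}^{d_i} f_i(x-\Delta_2)\,dx + \ell_3\int_{d_i}^{\infty} f_i(x+\Delta_2)\,dx & \text{for } \theta_i \ge \Delta_2, \\ \ell_4\int_{d_i}^{\infty} f_i(x+\Delta_1)\,dx & \text{for } \Delta_1 < |\theta_i| < \Delta_2. \end{cases}$$

If $-\Delta_1 \le \theta_i \le \Delta_1$, then $R_i(\theta,\delta_i) = \ell_2\big[\int_{-\infty}^{-d_i} f_i(x-\theta_i)\,dx + \int_{d_i}^{\infty} f_i(x-\theta_i)\,dx\big]$;
therefore,

$$R_i'(\theta,\delta_i) = \ell_2\big[f_i(d_i-\theta_i) - f_i(-d_i-\theta_i)\big] = \ell_2 f_i(\theta_i+d_i)\Big[\frac{f_i(\theta_i-d_i)}{f_i(\theta_i+d_i)} - 1\Big],$$

where $R_i'$ denotes the derivative of $R_i$ w.r.t. $\theta_i$.
It follows from the MLR property of $f_i(x-\theta_i)$ that $R_i'(\theta,\delta_i)$ has at most
one change of sign, from negative to positive if there is any sign
change at all; therefore, $R_i(\theta,\delta_i)$ attains its supremum over
$\theta_i \in [-\Delta_1,\Delta_1]$ at either $\theta_i = -\Delta_1$ or $\theta_i = \Delta_1$. Hence it follows from
the symmetry of $f_i(\cdot)$ that, for $\theta_i \in [-\Delta_1,\Delta_1]$,

$$R_i(\theta,\delta_i) \le \ell_2\int_{d_i}^{\infty} \big[f_i(x-\Delta_1)+f_i(x+\Delta_1)\big]\,dx.$$

It follows from (2.2.4) that $\sup_{\tau\in\Gamma} r_i(\tau,\delta_i) \le r_i$.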
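To make the bound in Lemma 2.3.1 concrete, the following sketch evaluates it when $f_i$ is a normal density with standard deviation `sigma`, so that each integral reduces to a normal distribution function. The normal choice, the function names and the argument names are assumptions made for illustration; the lemma itself only requires a symmetric, strongly unimodal density.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def risk_bound(d, gamma, gamma_c, d1, d2, losses, sigma=1.0):
    """Upper bound r_i of Lemma 2.3.1 for a N(0, sigma^2) density f_i.

    d       : cutoff d_i >= 0 of the rule
    gamma   : prior mass on R_W union R_B
    gamma_c : prior mass on R_C (gamma + gamma_c <= 1)
    losses  : tuple (l1, l2, l3, l4)
    """
    l1, l2, l3, l4 = losses

    def F(x):
        return norm_cdf(x / sigma)

    def upper(shift):
        # integral of f_i(x + shift) over (d, infinity)
        return 1.0 - F(d + shift)

    return (l3 * gamma * upper(d2)
            + l2 * gamma_c * (upper(-d1) + upper(d1))
            + l4 * (1.0 - gamma - gamma_c) * upper(d1)
            + l1 * gamma * F(d - d2))   # integral of f_i(x - d2) over (-inf, d)
```

For example, with all prior mass on the 'close' region and cutoff `d = 0`, the bound equals the loss constant `l2`, since every treatment is then misclassified.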
Now we derive the Γ-minimax rule in the following result.

Theorem 2.3.1. Assume that independent random variables $X_1,\dots,X_k$
have strongly unimodal symmetric p.d.f.'s $f_1(x_1-\theta_1),\dots,f_k(x_k-\theta_k)$,
respectively, and that the loss function is given by (2.2.1) and
(2.2.2). Then the Γ-minimax rule $\delta^*$ is given by $\delta_i^*(1|x) = I_{(-\infty,-d_i)}(x_i)$,
$\delta_i^*(2|x) = I_{[-d_i,d_i]}(x_i)$ and $\delta_i^*(3|x) = I_{(d_i,\infty)}(x_i)$ for
i = 1,...,k where each $d_i$ is determined by $d_i = c_i^+ = \max(c_i,0)$ with
$c_i$ being defined by

$$\ell_3\gamma_i f_i(x+\Delta_2) + \ell_2\gamma_i'\big(f_i(x-\Delta_1)+f_i(x+\Delta_1)\big) + \ell_4(1-\gamma_i-\gamma_i') f_i(x+\Delta_1) \ \le,\ > \ \ell_1\gamma_i f_i(x-\Delta_2) \quad\text{as } x \ge,\ < c_i. \tag{2.3.2}$$

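Proof. We first show that the decision rule $\delta^*$ is well defined by
verifying the existence of a $c_i$ satisfying (2.3.2). Note that the
difference between both sides in (2.3.2) can be written as

$$\ell_3\gamma_i f_i(x+\Delta_2) + \ell_2\gamma_i'\big(f_i(x-\Delta_1)+f_i(x+\Delta_1)\big) + \ell_4(1-\gamma_i-\gamma_i') f_i(x+\Delta_1) - \ell_1\gamma_i f_i(x-\Delta_2)$$

$$= f_i(x-\Delta_2)\Big[\ell_3\gamma_i\frac{f_i(x+\Delta_2)}{f_i(x-\Delta_2)} + \ell_2\gamma_i'\Big(\frac{f_i(x-\Delta_1)}{f_i(x-\Delta_2)} + \frac{f_i(x+\Delta_1)}{f_i(x-\Delta_2)}\Big) + \ell_4(1-\gamma_i-\gamma_i')\frac{f_i(x+\Delta_1)}{f_i(x-\Delta_2)} - \ell_1\gamma_i\Big].$$

Then the existence of such a $c_i$ follows from the MLR property of
$f_i(x-\theta_i)$. Consider a sequence of prior distributions $\{\tau_n\}$ of $\theta$
under which $\theta_1,\dots,\theta_k$ are independent, $P(\theta_i=\Delta_2) = P(\theta_i=-\Delta_2) = \gamma_i/2$,
$P(\theta_i=\Delta_1) = P(\theta_i=-\Delta_1) = \gamma_i'/2$ and $P(\theta_i=-\Delta_1-n^{-1}) = P(\theta_i=\Delta_1+n^{-1}) =
(1-\gamma_i-\gamma_i')/2$. Then clearly $\tau_n \in \Gamma$ for $n > (\Delta_2-\Delta_1)^{-1}$, and the overall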


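Furthermore, it can be easily shown that, for any $\Delta > 0$, $f_i(x-\Delta) - f_i(x+\Delta) \ge 0$, $< 0$
as $x \ge 0$, $< 0$ and therefore $p(1,x) \ge$, $\le p(3,x)$ as
$x \ge 0$, $< 0$. It follows from (2.3.3) that

$$\lim_n \inf_{\delta_i} r_i(\tau_n,\delta_i) = \frac{1}{2}\int_{-\infty}^{0} \min_{1\le j\le 2} p(j,x)\,dx + \frac{1}{2}\int_{0}^{\infty} \min_{2\le j\le 3} p(j,x)\,dx$$

$$= \int_{0}^{\infty} \min_{2\le j\le 3} p(j,x)\,dx$$

$$= \int_{0}^{\infty} \min\big[\ell_3\gamma_i f_i(x+\Delta_2) + \ell_4(1-\gamma_i-\gamma_i') f_i(x+\Delta_1) + \ell_2\gamma_i'\big(f_i(x+\Delta_1)+f_i(x-\Delta_1)\big),\ \ell_1\gamma_i f_i(x-\Delta_2)\big]\,dx + \int_{-\infty}^{0} \ell_1\gamma_i f_i(x-\Delta_2)\,dx. \tag{2.3.4}$$

The second equality in the above follows from the fact that $p(1,x) =
p(3,-x)$ and $p(2,x) = p(2,-x)$. It follows from (2.3.2) and (2.3.4) that

$$\lim_n \inf_{\delta_i} r_i(\tau_n,\delta_i) = \int_{d_i}^{\infty} \big[\ell_3\gamma_i f_i(x+\Delta_2) + \ell_4(1-\gamma_i-\gamma_i') f_i(x+\Delta_1) + \ell_2\gamma_i'\big(f_i(x+\Delta_1)+f_i(x-\Delta_1)\big)\big]\,dx + \int_{-\infty}^{d_i} \ell_1\gamma_i f_i(x-\Delta_2)\,dx$$

$$\ge \sup_{\tau\in\Gamma} r_i(\tau,\delta_i^*) \quad\text{by Lemma 2.3.1.}$$

Therefore,

$$\lim_n \inf_{\delta} r(\tau_n,\delta) = \lim_n \sum_{i=1}^{k} \inf_{\delta_i} r_i(\tau_n,\delta_i) \ \ge\ \sum_{i=1}^{k} \sup_{\tau\in\Gamma} r_i(\tau,\delta_i^*) \ \ge\ \sup_{\tau\in\Gamma} r(\tau,\delta^*). \tag{2.3.5}$$

Here, the first equality follows from the fact that the Bayes rule
consists of the Bayes rules for the component problems. Lemma 2.2.1
then yields that the decision rule $\delta^*$ is Γ-minimax among all decision
rules.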

We can derive a minimax rule using the proof of the above result.
For this purpose, assume that $\ell_1 = \ell_2 = \ell_4 = 1$ and $\ell_3 = \ell < 1$. Let
us consider a rule $\delta^M$ of the type given in Lemma 2.3.1 where each $d_i$
is determined so that, for $F_i(x) = \int_{-\infty}^{x} f_i(t)\,dt$,

$$F_i(d_i-\Delta_2) + \ell F_i(-d_i-\Delta_2) = F_i(-d_i-\Delta_1) + F_i(-d_i+\Delta_1). \tag{2.3.6}$$

Note that the existence of such a positive $d_i$ follows from strong
unimodality and the symmetry of $f_i(\cdot)$. Let us define $\gamma_i$ and $\gamma_i' = 1-\gamma_i$
for each i = 1,...,k by

$$\gamma_i = \frac{f_i(d_i-\Delta_1)+f_i(d_i+\Delta_1)}{f_i(d_i-\Delta_2) - \ell f_i(d_i+\Delta_2) + f_i(d_i-\Delta_1) + f_i(d_i+\Delta_1)}.$$

Since $\gamma_i \in [0,1]$, we can consider a family of priors, Γ, given in
(2.2.4). Then it follows from Theorem 2.3.1 that the corresponding
Γ-minimax rule is of the same type as $\delta^M$ except that now $d_i^* = \max(c_i,0)$
where $c_i$ is determined so that

$$H(c_i) = \gamma_i\big[\ell f_i(c_i+\Delta_2) - f_i(c_i-\Delta_2)\big] + \gamma_i'\big(f_i(c_i-\Delta_1)+f_i(c_i+\Delta_1)\big) = 0.$$

Since $H(d_i) = 0$ and $d_i > 0$, $d_i^* = d_i$, i.e., the rule $\delta^M$ is the Γ-minimax
procedure; therefore it follows from (2.3.5) and (2.3.1) that, for
some sequence of priors,
$$\lim_n \inf_{\delta} r(\tau_n,\delta) \ \ge\ \sum_{i=1}^{k} \Big\{\gamma_i\big[F_i(d_i-\Delta_2) + \ell F_i(-d_i-\Delta_2)\big] + \gamma_i'\big[F_i(-d_i-\Delta_1) + F_i(-d_i+\Delta_1)\big]\Big\}$$

$$= \sum_{i=1}^{k} \big[F_i(-d_i-\Delta_1) + F_i(-d_i+\Delta_1)\big] \quad\text{by (2.3.6)}$$

$$= \sum_{i=1}^{k} \sup_{\theta} R_i(\theta,\delta_i^M)$$

$$\ge \sup_{\theta} R(\theta,\delta^M).$$

The last equality in the above follows from similar arguments as in
the proof of Lemma 2.3.1. Therefore, we have the next general result
which includes Bhattacharyya's (1958) result as a special case.

Corollary 2.3.1. Under the assumptions in Theorem 2.3.1, if $\ell_1 = \ell_2 = \ell_4 = 1$
and $\ell_3 = \ell < 1$, then a minimax decision rule $\delta^M$ is given by $\delta_i^M(1|x) =
I_{(-\infty,-d_i)}(x_i)$, $\delta_i^M(2|x) = I_{[-d_i,d_i]}(x_i)$ and $\delta_i^M(3|x) = I_{(d_i,\infty)}(x_i)$ where
each $d_i$ is determined by (2.3.6).

The  following  example  illustrates  the  application  of  these  results. 

Example 2.3.1. Suppose $\pi_i$ represents normal population $N(\theta_i,\sigma^2)$ for
i = 0,1,...,k with $\theta_0$ and $\sigma^2$ known. We assume that a random sample
of size $n_i$ is taken from each of the k populations $\pi_1,\pi_2,\dots,\pi_k$. By
sufficiency we can restrict our attention to the decision rules
depending only on the sample means $\bar x_1,\dots,\bar x_k$ where $\bar x_i$ has normal
distribution with mean $\theta_i$ and variance $\sigma_i^2 = \sigma^2/n_i$ for i = 1,...,k.

(A) Γ-minimax rule: Application of Theorem 2.3.1 yields the Γ-minimax
decision rule $\delta^*$ of the type in Theorem 2.3.1 with $\bar x_i-\theta_0$ replacing
$x_i$ and $d_i$ being $\sigma_i\max(t_i,0)$ where $t_i$ is determined by

$$\ell_3\gamma_i e^{-2(\lambda_i+\epsilon_i)t_i} + \ell_2\gamma_i'\big[e^{-2\epsilon_i(t_i-\lambda_i)} + e^{-2\lambda_i(t_i-\epsilon_i)}\big] + \ell_4(1-\gamma_i-\gamma_i')e^{-2\lambda_i(t_i-\epsilon_i)} - \ell_1\gamma_i = 0 \tag{2.3.7}$$

where $\lambda_i+\epsilon_i = \Delta_2/\sigma_i$ and $\lambda_i-\epsilon_i = \Delta_1/\sigma_i$ for i = 1,...,k.
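Equation (2.3.7) has no closed-form solution in general, but its left side is strictly decreasing in $t_i$ whenever $\gamma_i > 0$, so a simple bisection finds the root. The sketch below is one possible way to compute the cutoff; the function name `gamma_minimax_cutoff` and its argument names are hypothetical, and bisection is a choice made here rather than a method stated in the text.

```python
import math

def gamma_minimax_cutoff(gamma, gamma_c, d1, d2, losses, sigma_i):
    """Solve (2.3.7) for t_i by bisection; return (t_i, d_i).

    gamma, gamma_c : prior masses on R_W union R_B and on R_C
    d1, d2         : zone constants, 0 < d1 < d2
    losses         : tuple (l1, l2, l3, l4)
    sigma_i        : standard deviation of the i-th sample mean
    """
    if gamma <= 0.0 or gamma + gamma_c > 1.0:
        raise ValueError("need gamma > 0 and gamma + gamma_c <= 1")
    l1, l2, l3, l4 = losses
    lam = (d2 + d1) / (2.0 * sigma_i)   # lambda_i
    eps = (d2 - d1) / (2.0 * sigma_i)   # epsilon_i
    rest = 1.0 - gamma - gamma_c        # prior mass on the indifference zones

    def h(t):
        # left side of (2.3.7); strictly decreasing in t
        return (l3 * gamma * math.exp(-2.0 * (lam + eps) * t)
                + l2 * gamma_c * (math.exp(-2.0 * eps * (t - lam))
                                  + math.exp(-2.0 * lam * (t - eps)))
                + l4 * rest * math.exp(-2.0 * lam * (t - eps))
                - l1 * gamma)

    lo, hi = -1.0, 1.0
    while h(lo) < 0.0:      # widen the bracket until h(lo) >= 0
        lo *= 2.0
    while h(hi) > 0.0:      # widen the bracket until h(hi) <= 0
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return t, sigma_i * max(t, 0.0)
```

When all prior mass sits on $R_W \cup R_B$ the equation collapses to $\ell_3 e^{-2(\lambda_i+\epsilon_i)t_i} = \ell_1$, which gives a convenient check of the solver.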

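(B) Minimax rule: We assume $\ell_1 = \ell_2 = \ell_4 = 1$ and $\ell_3 = \ell < 1$. Then a
minimax decision rule $\delta^M$ is of the type in Corollary 2.3.1 with $\bar x_i-\theta_0$
replacing $x_i$ and $d_i$ being $\sigma_i t_i$ where $t_i$ is determined by

$$\Phi(t_i-\lambda_i-\epsilon_i) + \ell\,\Phi(-t_i-\lambda_i-\epsilon_i) = \Phi(-t_i-\lambda_i+\epsilon_i) + \Phi(-t_i+\lambda_i-\epsilon_i) \tag{2.3.8}$$

with $\lambda_i$ and $\epsilon_i$ defined as in (A).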
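The minimax cutoff of (2.3.8) can be found numerically in the same spirit. At $t_i = 0$ the left side is below the right side, and the difference increases in $t_i$ for $t_i > 0$, so bisection on the positive half-line is adequate. The names `minimax_equation` and `minimax_cutoff` below are hypothetical; the sketch assumes the normal model of Example 2.3.1.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def minimax_equation(t, lam, eps, ell):
    """Left side minus right side of (2.3.8)."""
    return (norm_cdf(t - lam - eps) + ell * norm_cdf(-t - lam - eps)
            - norm_cdf(-t - lam + eps) - norm_cdf(-t + lam - eps))

def minimax_cutoff(d1, d2, ell, sigma_i):
    """Solve (2.3.8) for t_i > 0 by bisection; return (t_i, d_i = sigma_i * t_i)."""
    if not (0.0 < d1 < d2 and 0.0 < ell < 1.0):
        raise ValueError("need 0 < d1 < d2 and 0 < ell < 1")
    lam = (d2 + d1) / (2.0 * sigma_i)   # lambda_i
    eps = (d2 - d1) / (2.0 * sigma_i)   # epsilon_i
    lo, hi = 0.0, 1.0
    while minimax_equation(hi, lam, eps, ell) < 0.0:
        hi *= 2.0                        # the difference tends to 1 as t grows
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if minimax_equation(mid, lam, eps, ell) < 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return t, sigma_i * t
```

Because the equation involves only $\lambda_i$, $\epsilon_i$ and $\ell$, the same routine applies to any sample size once $\sigma_i = \sigma/\sqrt{n_i}$ is supplied.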

2.4  Unknown  control 

In this section we will consider the case when $\theta_0$ is unknown and
will derive a Γ-minimax decision rule $\delta^*$ in the class $\mathcal{D}_0$ of decision
rules $\delta = (\delta_1,\dots,\delta_k)$ for which $\delta_i$ depends only on $x_0$ and $x_i$.
We state the following well-known lemma of Ibragimov (1956).

Lemma 2.4.1. The convolution of two strongly unimodal probability
density functions is also strongly unimodal.

It follows that the pdf of $Y_i = X_i - X_0$ given by

$$g_i(y-(\theta_i-\theta_0)) = \int_{-\infty}^{\infty} f_i(x+y-\theta_i) f_0(x-\theta_0)\,dx, \tag{2.4.1}$$

is strongly unimodal and symmetric about the origin. The next result
follows from this fact and Lemma 2.3.1.
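Lemma 2.4.2. Suppose that a decision rule $\delta(x)$ is defined by
$\delta_i(1|x) = I_{(-\infty,-d_i)}(x_i-x_0)$, $\delta_i(2|x) = I_{[-d_i,d_i]}(x_i-x_0)$ and $\delta_i(3|x) =
I_{(d_i,\infty)}(x_i-x_0)$ for $d_i \ge 0$, i = 1,...,k.
Then, for i = 1,...,k,

$$\sup_{\tau\in\Gamma} r_i(\tau,\delta_i) \le r_i \tag{2.4.2}$$

where, for i = 1,...,k,

$$r_i = \int_{d_i}^{\infty} \big[\ell_3\gamma_i g_i(y+\Delta_2) + \ell_2\gamma_i'\big(g_i(y-\Delta_1)+g_i(y+\Delta_1)\big) + \ell_4(1-\gamma_i-\gamma_i') g_i(y+\Delta_1)\big]\,dy + \int_{-\infty}^{d_i} \ell_1\gamma_i g_i(y-\Delta_2)\,dy.$$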

We now proceed as in Theorem 2.3.1 by considering the following
sequence $\{\tau_n,\ n = 1,2,\dots\}$ of prior distributions.

Under $\tau_n$, (i) $\theta_1-\theta_0,\dots,\theta_k-\theta_0$ are independent,
(ii) $P[\theta_i-\theta_0=\Delta_2] = \gamma_i/2 = P[\theta_i-\theta_0=-\Delta_2]$,
$P[\theta_i-\theta_0=\Delta_1] = \gamma_i'/2 = P[\theta_i-\theta_0=-\Delta_1]$,
$P[\theta_i-\theta_0=\Delta_1+n^{-1}] = (1-\gamma_i-\gamma_i')/2 = P[\theta_i-\theta_0=-\Delta_1-n^{-1}]$ and
(iii) $\theta_0$ has uniform distribution over $[-n,n]$ and is
independent of $\theta_1-\theta_0,\dots,\theta_k-\theta_0$.

It can be easily shown that the Bayes rule in $\mathcal{D}_0$ w.r.t. $\tau_n$ is determined
by Bayes rules for component problems and that the Bayes rule for the
i-th component problem depends on x only through $x_i$ and $x_0$. It follows
from simple computation of the posterior risk of each possible action
that the overall risk of the Bayes rule for the i-th component
problem is given by
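$$+\ (\ell_1+\ell_3)\gamma_i \int_{n(v-1)}^{n(v+1)} f_i(z-w-\Delta_2) f_0(z+w)\,dz \quad\text{and}$$

$$h_n(2,v,w) = \ell_1\gamma_i \int_{n(v-1)}^{n(v+1)} \big[f_i(z-w+\Delta_2)+f_i(z-w-\Delta_2)\big] f_0(z+w)\,dz.$$

Furthermore it is easy to see that, for any $(v,w) \in (-1,1)\times R$, $h_n(v,w)$
converges to $h(w) = \min\{h(1,w),\ h(2,w),\ h(1,-w)\}$ where

$$h(1,w) = \ell_2\gamma_i' \int_{-\infty}^{\infty} \big[f_i(z-w-\Delta_1)+f_i(z-w+\Delta_1)\big] f_0(z+w)\,dz + \ell_4(1-\gamma_i-\gamma_i') \int_{-\infty}^{\infty} f_i(z-w-\Delta_1) f_0(z+w)\,dz + (\ell_1+\ell_3)\gamma_i \int_{-\infty}^{\infty} f_i(z-w-\Delta_2) f_0(z+w)\,dz \quad\text{and}$$

$$h(2,w) = \ell_1\gamma_i \int_{-\infty}^{\infty} \big[f_i(z-w+\Delta_2)+f_i(z-w-\Delta_2)\big] f_0(z+w)\,dz.$$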


Since $h_n(v,w)$ is bounded above by $h(2,w)$ which is integrable, it
follows from (2.4.3) and the Lebesgue convergence theorem that

$$\lim_n \inf_{\delta\in\mathcal{D}_0} r_i(\tau_n,\delta_i) \ \ge\ \int_{-\infty}^{\infty} h(w)\,dw = \frac{1}{2}\int_{-\infty}^{\infty} \min\{p(1,y),\ p(2,y),\ p(1,-y)\}\,dy \tag{2.4.4}$$

where $p(1,y) = \ell_2\gamma_i'\big[g_i(y-\Delta_1)+g_i(y+\Delta_1)\big] + \ell_4(1-\gamma_i-\gamma_i') g_i(y-\Delta_1) +
(\ell_1+\ell_3)\gamma_i g_i(y-\Delta_2)$ and
$p(2,y) = \ell_1\gamma_i\big[g_i(y+\Delta_2)+g_i(y-\Delta_2)\big]$.

The identity in (2.4.4) follows from (2.4.1) and a change of
variable. Since $p(1,y) \ge$, $\le p(1,-y)$ as $y \ge 0$, $< 0$, (2.4.4) yields
that

$$\lim_n \inf_{\delta\in\mathcal{D}_0} r_i(\tau_n,\delta_i) \ \ge\ \int_{0}^{\infty} \min\big[\ell_3\gamma_i g_i(y+\Delta_2) + \ell_4(1-\gamma_i-\gamma_i') g_i(y+\Delta_1) + \ell_2\gamma_i'\big(g_i(y+\Delta_1)+g_i(y-\Delta_1)\big),\ \ell_1\gamma_i g_i(y-\Delta_2)\big]\,dy + \int_{-\infty}^{0} \ell_1\gamma_i g_i(y-\Delta_2)\,dy. \tag{2.4.5}$$

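Now consider a decision rule $\delta^*$ defined by $\delta_i^*(1|x) = I_{(-\infty,-d_i)}(x_i-x_0)$,
$\delta_i^*(2|x) = I_{[-d_i,d_i]}(x_i-x_0)$ and $\delta_i^*(3|x) = I_{(d_i,\infty)}(x_i-x_0)$ for i = 1,...,k
where each $d_i$ is determined by $d_i = \max(c_i,0)$ so that

$$\ell_3\gamma_i g_i(y+\Delta_2) + \ell_2\gamma_i'\big(g_i(y-\Delta_1)+g_i(y+\Delta_1)\big) + \ell_4(1-\gamma_i-\gamma_i') g_i(y+\Delta_1) \ \le,\ > \ \ell_1\gamma_i g_i(y-\Delta_2) \quad\text{as } y \ge,\ < c_i. \tag{2.4.6}$$

From Lemma 2.4.1, Lemma 2.4.2, (2.4.5) and (2.4.6) it follows that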

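$$\lim_n \inf_{\delta\in\mathcal{D}_0} r_i(\tau_n,\delta_i) \ \ge\ \sup_{\tau\in\Gamma} r_i(\tau,\delta_i^*),$$

and this in turn implies that

$$\lim_n \inf_{\delta\in\mathcal{D}_0} r(\tau_n,\delta) = \lim_n \sum_{i=1}^{k} \inf_{\delta\in\mathcal{D}_0} r_i(\tau_n,\delta_i) = \sum_{i=1}^{k} \lim_n \inf_{\delta\in\mathcal{D}_0} r_i(\tau_n,\delta_i) \ \ge\ \sum_{i=1}^{k} \sup_{\tau\in\Gamma} r_i(\tau,\delta_i^*) \ \ge\ \sup_{\tau\in\Gamma} r(\tau,\delta^*).$$

Lemma 2.2.1 then yields the next result.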

Theorem 2.4.1. Assume that independent random variables $X_0,\dots,X_k$
have strongly unimodal symmetric pdf's $f_0(x_0-\theta_0),\dots,f_k(x_k-\theta_k)$,
respectively, and that the loss function is given by (2.2.1) and
(2.2.2). Then the Γ-minimax decision rule $\delta^*$ in $\mathcal{D}_0$ is given by
$\delta_i^*(1|x) = I_{(-\infty,-d_i)}(x_i-x_0)$, $\delta_i^*(2|x) = I_{[-d_i,d_i]}(x_i-x_0)$ and
$\delta_i^*(3|x) = I_{(d_i,\infty)}(x_i-x_0)$ for i = 1,...,k where $d_i = \max(c_i,0)$
with $c_i$ being determined by (2.4.6).

Remark 2.4.1. Note that the symmetry of $f_i(\cdot)$ can be replaced by
the symmetry of $g_i(\cdot)$ for Theorem 2.4.1 to hold. It can be easily
shown that the relation (2.4.4) holds without the assumption of the
symmetry of $f_i(\cdot)$. Then all the steps after (2.4.4) still remain true
provided $g_i(\cdot)$ is symmetric. It should also be pointed out that the
symmetry of $g_i(\cdot)$ follows when $f_0(\cdot), f_1(\cdot),\dots,f_k(\cdot)$ are identical.

The next result follows in exactly the same manner as Corollary
2.3.1 was proved.

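Corollary 2.4.1. Under the assumptions in Theorem 2.4.1, if
$\ell_1 = \ell_2 = \ell_4 = 1$ and $\ell_3 = \ell < 1$, then a minimax decision rule $\delta^M$
in $\mathcal{D}_0$ is given by $\delta_i^M(1|x) = I_{(-\infty,-d_i)}(x_i-x_0)$, $\delta_i^M(2|x) =
I_{[-d_i,d_i]}(x_i-x_0)$ and $\delta_i^M(3|x) = I_{(d_i,\infty)}(x_i-x_0)$ for i = 1,...,k where $d_i > 0$
is determined so that, for $G_i(x) = \int_{-\infty}^{x} g_i(t)\,dt$,

$$G_i(d_i-\Delta_2) + \ell\,G_i(-d_i-\Delta_2) = G_i(-d_i-\Delta_1) + G_i(-d_i+\Delta_1).$$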


We illustrate the application of the above results by the following
examples.

Example 2.4.1. Consider the same problem in Example 2.3.1 except
that $\theta_0$ is unknown. In this case a random sample of size $n_0$ is also
taken from the control $\pi_0$. Let $\bar x_0,\bar x_1,\dots,\bar x_k$ denote the sample means
corresponding to $\pi_0,\dots,\pi_k$, respectively. Here, $\sigma_0^2 = \sigma^2/n_0$.

(A) Γ-minimax rule: The Γ-minimax rule $\delta^*$ in $\mathcal{D}_0$ given in Theorem
2.4.1 is determined by $d_i = (\sigma_i^2+\sigma_0^2)^{1/2}\max(t_i,0)$ where $t_i$ satisfies
(2.3.7) with $\lambda_i+\epsilon_i = \Delta_2(\sigma_i^2+\sigma_0^2)^{-1/2}$ and $\lambda_i-\epsilon_i = \Delta_1(\sigma_i^2+\sigma_0^2)^{-1/2}$.

(B) Minimax rule: Assume $\ell_1 = \ell_2 = \ell_4 = 1$ and $\ell_3 = \ell < 1$. Then the
minimax rule $\delta^M$ in $\mathcal{D}_0$ given in Corollary 2.4.1 is determined by
$d_i = (\sigma_i^2+\sigma_0^2)^{1/2} t_i$ with $t_i$ satisfying (2.3.8) where $\lambda_i$ and $\epsilon_i$ are
defined as in the above. This offers a partial solution of the
problem in Section 5.1 of Bhattacharyya (1956).

Example 2.4.2. Assume that $\pi_i$ represents normal population $N(0,\sigma_i^2)$
for i = 0,1,...,k with $\sigma_i^2$ unknown, and that we have a random sample
of size n taken from each population $\pi_i$. Consider a problem of
partitioning the treatment populations in terms of variances
$\sigma_1^2,\dots,\sigma_k^2$ with a loss structure analogous to that given by
(2.2.1) and (2.2.2), i.e. a loss function obtained from the latter
by substituting $\log\sigma_i^2$, $\log\Delta_1$ and $\log\Delta_2$ for $\theta_i$, $\Delta_1$ and $\Delta_2$,
respectively. Thus $\Delta_1$ and $\Delta_2$ here are assumed such that $1 < \Delta_1 < \Delta_2$.
By sufficiency we need to consider only the decision rules depending
on $S_0^2,\dots,S_k^2$ where $S_i^2$ denotes the sample variance corresponding to
$\pi_i$. Since $nS_i^2/\sigma_i^2$ (i = 0,1,...,k) are independently distributed chi-
square random variables with degrees of freedom n, it can be easily
seen that the associated location parameter problem satisfies the
assumptions in Theorem 2.4.1 except the symmetry, which is not necessary
in this problem because of Remark 2.4.1. Therefore, with obvious
modifications we have the following results.

Let $\mathcal{D}_0$ denote the class of decision rules $\delta = (\delta_1,\dots,\delta_k)$ for which
$\delta_i$ depends only on $S_0^2$ and $S_i^2$, and let $T_i$ denote $S_i^2/S_0^2$ for i = 1,...,k.


(A) Γ-minimax rule: A Γ-minimax rule $\delta^*$ in $\mathcal{D}_0$ is given by
$\delta_i^*(1|T_i) = I_{(0,d_i^{-1})}(T_i)$, $\delta_i^*(2|T_i) = I_{[d_i^{-1},d_i]}(T_i)$ and $\delta_i^*(3|T_i) =
I_{(d_i,\infty)}(T_i)$ for i = 1,...,k, with each $d_i$ being determined by
$d_i = \max(c_i,1)$ so that


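$$\ell_3\gamma_i\Big(\frac{\Delta_2+y}{1+\Delta_2 y}\Big)^{n} + \ell_2\gamma_i'\Big(\frac{\Delta_1}{\Delta_2}\Big)^{n/2}\Big[\Big(\frac{\Delta_2+y}{\Delta_1+y}\Big)^{n} + \Big(\frac{\Delta_2+y}{1+\Delta_1 y}\Big)^{n}\Big] + \ell_4(1-\gamma_i-\gamma_i')\Big(\frac{\Delta_1}{\Delta_2}\Big)^{n/2}\Big(\frac{\Delta_2+y}{1+\Delta_1 y}\Big)^{n} \ \le,\ > \ \ell_1\gamma_i \quad\text{as } y \ge,\ < c_i. \tag{2.4.7}$$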


(B) Minimax rule: Assume $\ell_1 = \ell_2 = \ell_4 = 1$ and $\ell_3 = \ell < 1$. Then the
minimax rule $\delta^M$ in $\mathcal{D}_0$ is the same as $\delta^*$ in (A) except that $d_i = d$
for i = 1,...,k where d is determined so that

$$G_n(d/\Delta_2) + \ell\,[1-G_n(d\Delta_2)] = G_n(\Delta_1/d) + 1 - G_n(d\Delta_1) \tag{2.4.8}$$

where $G_n$ denotes the distribution function of the F-distribution with
degrees of freedom n and n.

We note that if $\pi_i$ represents $N(\mu_i,\sigma_i^2)$ with both $\mu_i$ and $\sigma_i^2$ unknown,
then the above results still hold with n-1 replacing n where $S_i^2$ in
such a case is the best unbiased estimator of $\sigma_i^2$.


2.5 Comparison of Γ-minimax rules with Bayes rules

When we represent our a priori information about the parameters
by prior distributions over the parameter space, one method for the
use of such information is to find a rule which is Γ-minimax with
respect to the class of such priors. Another way is to select one of
such priors and use the corresponding Bayes rule. Thus Bayes rules
wrt prior distributions in Γ are natural competitors for a Γ-minimax
rule.

In this section we consider k+1 normal populations $N(\theta_i,\sigma^2)$ with
$\sigma^2$ known, and will derive Bayes rules wrt a normal prior and then
we compare them with the corresponding Γ-minimax rules from both
points of view. Assume that $(\theta_0,\dots,\theta_k)$ have prior distribution $\tau_0$
under which

(i) $\theta_0,\dots,\theta_k$ are independent and
(ii) each $\theta_i$ has a (marginal) normal distribution with mean $\mu_i$ and
variance $\nu_i^2$.                                                   (2.5.1)

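Let $\bar x_0,\dots,\bar x_k$ denote the sample means based on samples of size $n_i$
(i = 0,1,...,k). To simplify forthcoming formulae, let us introduce
the following notations.

$$\sigma_i^2 = \sigma^2/n_i, \qquad b_i = \big[(\sigma_i^{-2}+\nu_i^{-2})^{-1} + (\sigma_0^{-2}+\nu_0^{-2})^{-1}\big]^{1/2},$$

$$\lambda_i+\epsilon_i = \Delta_2/b_i, \qquad \lambda_i-\epsilon_i = \Delta_1/b_i, \tag{2.5.2}$$

$$m_i = (\sigma_i^{-2}\bar x_i + \nu_i^{-2}\mu_i)(\sigma_i^{-2}+\nu_i^{-2})^{-1}, \qquad y_i = (m_i-m_0)/b_i.$$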


The  following  theorem  describes  the  Bayes  rule. 


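Theorem 2.5.1. Assume the prior $\tau_0$ as specified in (2.5.1). Then
the Bayes rule $\delta^B$ wrt $\tau_0$ is given by $\delta_i^B(1|y) = I_{(-\infty,-d_i)}(y_i)$,
$\delta_i^B(2|y) = I_{[-d_i,d_i]}(y_i)$ and $\delta_i^B(3|y) = I_{(d_i,\infty)}(y_i)$ for i = 1,...,k
where each $d_i = \max(c_i,0)$ is determined so that

$$\ell_3\Phi(-\lambda_i-\epsilon_i-y) + \ell_4\big[\Phi(-\lambda_i+\epsilon_i-y) - \Phi(-\lambda_i-\epsilon_i-y)\big] + \ell_2\big[\Phi(\lambda_i-\epsilon_i-y) - \Phi(-\lambda_i+\epsilon_i-y)\big] - \ell_1\Phi(-\lambda_i-\epsilon_i+y) \ \ge 0,\ < 0 \quad\text{as } y \le c_i,\ y > c_i \tag{2.5.3}$$

with $\Phi$ denoting the distribution function of the N(0,1) distribution.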

Proof. It suffices to find the Bayes rule for each of the k component
problems. This reduces to comparison of posterior risks of three
possible actions. Let $p_1(y_i)$, $p_2(y_i)$ and $p_3(y_i)$ denote the posterior
risks of action 1, 2 and 3, respectively, in the i-th component problem,
then it can be shown that

$$p_1(y) = (\ell_1+\ell_3)\Phi(-\lambda_i-\epsilon_i+y) + \ell_4\big[\Phi(\lambda_i+\epsilon_i-y) - \Phi(\lambda_i-\epsilon_i-y)\big] + \ell_2\big[\Phi(\lambda_i-\epsilon_i-y) - \Phi(-\lambda_i+\epsilon_i-y)\big],$$

$$p_2(y) = \ell_1\big[\Phi(-\lambda_i-\epsilon_i-y) + \Phi(-\lambda_i-\epsilon_i+y)\big] \quad\text{and}$$

$$p_3(y) = p_1(-y).$$

Note that $p_1(y)-p_3(y)$ can be written as $E_y H(Z)$ where Z has normal
distribution with mean y and variance 1 and $H(\cdot)$ is given by

$$H(z) = \begin{cases} \ell_1+\ell_3 & \text{if } z \ge \lambda_i+\epsilon_i, \\ \ell_4 & \text{if } \lambda_i-\epsilon_i < z < \lambda_i+\epsilon_i, \\ 0 & \text{if } -\lambda_i+\epsilon_i \le z \le \lambda_i-\epsilon_i, \\ -\ell_4 & \text{if } -\lambda_i-\epsilon_i < z < -\lambda_i+\epsilon_i, \\ -(\ell_1+\ell_3) & \text{if } z \le -\lambda_i-\epsilon_i. \end{cases}$$

Since the density of normal distribution N(y,1) has MLR property,
it follows that $p_1(y)-p_3(y)$ has at most one change of sign. Furthermore,
it can be shown that $p_1(y)-p_3(y)$ is increasing on R and
$p_1(0)-p_3(0) = 0$. Thus $p_1(y)-p_3(y) > 0$, $< 0$ as $y > 0$, $y < 0$.
Similarly, we can show that $p_3(y)-p_2(y) \ge 0$, $< 0$ as $y \le c_i$, $y > c_i$ for
some real number $c_i$ unless $p_3(y)-p_2(y) \ge 0$ for all y. Therefore the
result follows.

Thus comparison can be made between the $\Gamma$-minimax rule $\delta^*$ given
in Example 2.4.1 and the Bayes rule $\delta^B$ given in Theorem 2.5.1 under
the relations

$$\gamma_i = \Phi[(-\Delta_2+\mu_i-\mu_0)(\nu_i^2+\nu_0^2)^{-1/2}] + \Phi[(-\Delta_2-\mu_i+\mu_0)(\nu_i^2+\nu_0^2)^{-1/2}]$$

and

$$\gamma_i' = \Phi[(\Delta_1-\mu_i+\mu_0)(\nu_i^2+\nu_0^2)^{-1/2}] - \Phi[(-\Delta_1-\mu_i+\mu_0)(\nu_i^2+\nu_0^2)^{-1/2}].$$

We compare these rules under the assumption that $\ell_1 = \ell_2 = \ell_4 = 1$,
$\ell_3 = \ell$, $n_i = n$ and $\nu_i^2 = \nu_0^2$ for $i = 1,\ldots,k$. There are two
meaningful ways to compare these rules. One way is to examine the increase in
overall risk wrt $\tau_0$ resulting from the use of the $\Gamma$-minimax rule.
Another way is to compare these rules in terms of $\sup_{\tau\in\Gamma} r(\tau,\delta)$. When
$n_i = n$ and $\nu_i^2 = \nu_0^2$, the Bayes rule depends on $x$ only through
$x_1-x_0,\ldots,x_k-x_0$ and it can be shown that
$\sup_{\tau\in\Gamma} r(\tau,\delta^B) = \sum_{i=1}^{k} \sup_{\tau\in\Gamma} r_i(\tau,\delta_i^B)$.
Thus it suffices to compare these rules with respect to classification
of one population. We choose $\pi_1$ for this purpose without loss of
generality.

Now  we  introduce  the  parameters  used  in  the  comparison  as 
follows. 


e 


vo  nvo 


1 2 

°0  0 


2’  62 


a2+a1 
2 H 


a2‘ai  urMo 

-•  «3  ■ -S4  a"d  h- 


^ 1 


The  overall  risk  wrt  tq  of  these  rules  can  be  written  as 


m(-A-B-C)  + <f  (A-B-C)  + «(D-E)  + zt(-D-E) 

- $n(-A-B-C ,-D-E ;P ) + (1-t)*  ( -A-B-C,  D-E;P) 

- 4>q(-A+B-C ,D-E :p ) + $q(A-B-C,-D-E;p) 

- 4>q(-A+B-C  ,-D-E;p  )-4>q( A-B-C  ,D-E  ;P ) 

+ 4>q(-A+B-C  ,D-E  ;p  )-<i>0  (A-B-C  ,-D-E  ;P ) 

- <t0(A+B  -C,D-E;P)+(l-O*0(A+B-C,-D-E;P) 

where  <j >q(-,-;p)  is  the  cdf  of  a bivariate  normal  distribution  with 
zero  means,  unit  variances  and  correlation  coefficient  P,  A=&2>  B^, 

C = p = 3^ ( 1 +3-| ) for  both  6 and  6*,  D = d-j^”2  for  6 , 

D = max(t^ ,0) (1+31 )"2  for  6*,  E = p 6^  for  6 and  E = Pe^  for  6*  with 
d-j  and  t.|  being  those  in  Theorem  2.5.1  and  in  Example  2.4.1.  Also 

sup  r,(x,6,)  for  both  rules  can  be  written  as 

xer 
YlL*(R+|S|-T-U)-*(-R+|S|-T+U)+t*(-R+|S|-T-U)] 

+ [4>(-R-S+T-U  )+4>(-R+S-T+U  )]V[4>(-R+S+T-U ) ( - R- S-T+U ) ] 

+ (l-Yp*(-R+|S|-T+U) 

v -1-  1 B 

where  x y = max(x,y),  T = U = ^3^1  f°r  both  6 anc*  <s*» 

S=  for  6B,  S = 0 for  6*.  R = 3^ ( 1+3-j  )'adj  for  6B,  R = max(tj,0) 

with  d-j  and  t-j  being  those  in  Theorem  2.5.1  and  in  Example  2.4.1. 

Note that $y_1$, $\lambda_1$ and $\epsilon_1$ in Theorem 2.5.1 can be written as

$$y_1 = \sqrt{\frac{\beta_1}{1+\beta_1}}\,\frac{x_1-x_0}{\sqrt{2}\,\sigma/\sqrt{n}} + \frac{\beta_4}{\sqrt{1+\beta_1}}, \quad \lambda_1 = \beta_2\sqrt{1+\beta_1} \quad\text{and}\quad \epsilon_1 = \beta_3\sqrt{1+\beta_1}.$$

Thus the constant $d_1$ does not change for the different values of $\beta_4$
when $\beta_1$, $\beta_2$ and $\beta_3$ are fixed. Particularly, if $\ell = 1$, then $d_1$ is
easily found to be $\beta_2\sqrt{1+\beta_1}$, which does not vary for different values
of $\beta_3$. Table IV and Table V give $r_1(\tau_0,\delta_1)$ and
$\sup_{\tau\in\Gamma} r_1(\tau,\delta_1)$ for $\delta_1 = \delta_1^B, \delta_1^*$, for $\ell = 0$
and $\ell = 1$, respectively. As by-products they also provide the constants
needed to implement these rules. It can be observed from these tables that
the increase in overall risk wrt $\tau_0$ from the use of $\delta^*$ is only slight
compared to that in $\sup_{\tau\in\Gamma} r_1(\tau,\delta_1)$ from the use of $\delta^B$. In this sense,
$\delta^*$ is more robust against other formulation. As can be expected,
such robustness of $\delta^*$ becomes more prominent as the difference
between the prior means ($\beta_4$) increases and the prior variance ($\beta_1$)
gets smaller. When the prior variance is large and we have the same
prior means, both rules compare favorably with each other. In many
cases, we can observe that the $\Gamma$-minimax decision rule compares
favorably with the given Bayes rule in terms of overall risk.


1 .34731 . 5630 .5937  | .5170 .5190  | .4210 

$\max(t_1,0)$ in Example 2.4.1 and $d_1$ is the constant for implementing the Bayes rule in Theorem 2.5.1.


Table IV: $\ell = 0$ (continued)
CHAPTER  3 

SELECTION  PROCEDURES  FOR  A PROBLEM  IN  RELIABILITY 
AND  FOR  A PROBLEM  OF  SCALE  PARAMETERS 

3.1  Introduction 

In this chapter we discuss two selection problems, one arising
in reliability theory and another for symmetric scale parameter
populations. The first problem deals with an $\ell$-out-of-$m$ system
where $m$ components are to be placed and at least $\ell$ of them should
function to make the system work. In many situations several brands
(populations) of components are available from which we need to
choose $m$ components for a system. Note that it is allowed to draw
more than one component from a population. It is assumed that
the lifelength of a component from population $\pi_i$ is exponentially
distributed with mean $\lambda_i^{-1}$ for $i = 1,\ldots,k$ and that the components
in the system are statistically independent. Brostrom (1977)
considered the 1-out-of-2 system when only two populations are
available. He assumed a loss function depending on $(\lambda_1,\lambda_2)$ only
through $\lambda_1/\lambda_2$ so that the problem is invariant under the scale
transformation, and then studied the admissibility and the minimaxity
of some rules. In Section 3.2, we consider two cases: (A) the m-out-of-m,
i.e., series, system and (B) the 1-out-of-2 system, when $k$ ($k \ge 2$)
populations are available. We assume a loss function inversely proportional
to the expected lifelength of the system, and derive a uniformly best
decision rule among the permutation invariant rules for the series
system and a Bayes rule wrt a natural conjugate prior for the
1-out-of-2 system. Tables to implement the Bayes rule are provided at
the end of this chapter.

The second part of this chapter consists of the investigation
of the selection procedures for scale parameters of symmetric
distributions. Contributions for this problem have been made by Puri and
Puri (1969), Blumenthal and Patterson (1969), Gupta and McDonald (1970),
Bhapkar and Gore (1971) and McDonald (1977) among others. Since ranking
the populations with regard to scale parameters is equivalent to ranking
them in terms of measures of dispersion, we can consider selection
procedures based on estimators of measures of dispersion. In
Section 3.3, we consider two problems: (A) selection of the $t$ 'best'
populations under the indifference-zone approach, and (B) selection of a
subset containing the 'best' population. We consider selection
procedures based on the p-th power sample deviations and on the
trimmed standard deviations for problem (A), and derive a large sample
solution of the sample size required by the basic probability
condition. Asymptotic relative efficiencies of the procedures in (B) are
shown to be the same as those in (A) as well as those of the
corresponding estimators of Bickel and Lehmann (1976).

3.2  Some  selection  problems  arising  in  reliability  theory 

Let $\pi_1,\ldots,\pi_k$ ($k \ge 2$) denote the available populations (brands)
and assume that each component from $\pi_i$ has an exponentially distributed
lifelength with mean lifelength $\lambda_i^{-1}$ for $i = 1,\ldots,k$. Based on $n$
independent observations $X_{i1},\ldots,X_{in}$ from each $\pi_i$ ($i = 1,\ldots,k$), we
want to find an 'optimal' selection rule assuming a suitable loss
function. By sufficiency the problem can be reduced to the one based
on $X_1,\ldots,X_k$ where $X_i = \sum_{j=1}^{n} X_{ij}$ has gamma distribution with mean
$n\lambda_i^{-1}$ and variance $n\lambda_i^{-2}$. Throughout this section let
$x_{[1]} \le \cdots \le x_{[k]}$ denote the ordered observations and let $\pi_{(i)}$ and
$\lambda_{(i)}$ denote the $\pi$ and the $\lambda$ associated with $x_{[i]}$ for $i = 1,\ldots,k$.
Given $x = (x_1,\ldots,x_k)$, the posterior risk of a decision rule $\delta$ will be
denoted by $r(\delta,x)$.

(A)  The  series  system 

Here we denote the action space by
$\mathcal{A} = \{(i_1,\ldots,i_m): 1 \le i_1 \le \cdots \le i_m \le k\}$ where $(i_1,\ldots,i_m)$ is to be
interpreted as the action of drawing the j-th component from $\pi_{i_j}$ for
$j = 1,\ldots,m$. For the series system the expected lifelength of the system
for the action $(i_1,\ldots,i_m)$ is easily seen to be $(\sum_{j=1}^{m} \lambda_{i_j})^{-1}$. We will
consider a loss function which is inversely proportional to the expected
lifelength corresponding to an action, i.e., the loss function is assumed
to be

$$L(\lambda,(i_1,\ldots,i_m)) = \sum_{j=1}^{m} \lambda_{i_j}. \tag{3.2.1}$$

Thus the posterior risk of a decision rule $\delta$, which leads to an action
$(i_1,\ldots,i_m) \in \mathcal{A}$ with probability 1, is given by

$$r(\delta,x) = E\Big[\sum_{j=1}^{m} \lambda_{i_j} \,\Big|\, x\Big]. \tag{3.2.2}$$


Let $N_s = \{n_s = (n_1,\ldots,n_s): n_1 \ge \cdots \ge n_s \ge 1, \sum_{j=1}^{s} n_j = m, n_j\text{'s are integers}\}$
and let $a \wedge b = \min(a,b)$. The next result leads to
considerable reduction in the number of decision rules to be
compared for the Bayes rule when the prior of $\lambda$ is assumed to be
permutationally symmetric.


Lemma 3.2.1. If the prior distribution of $\lambda$ is permutationally
symmetric on $(0,\infty)^k$, then the Bayes rule $\delta^*$ is determined by

$$r(\delta^*,x) = \min_{1\le s\le k\wedge m}\ \min_{n_s\in N_s} r(\delta_{n_s},x) \tag{3.2.3}$$

where $\delta_{n_s}$ selects an action of drawing $n_j$ components from $\pi_{(k-j+1)}$ for
$j = 1,\ldots,s$.


Proof. For $s = 1,\ldots,k\wedge m$, let us define $\mathcal{A}_s^*$ to be
$\mathcal{A}_s^* = \{(i_1,\ldots,i_s; n_1,\ldots,n_s): n_s \in N_s, 1 \le i_j \le k, i_j \ne i_{j'} \text{ for } j \ne j'\}$
where $(i_1,\ldots,i_s; n_1,\ldots,n_s)$ is interpreted as drawing $n_j$ components from
$\pi_{i_j}$ ($j = 1,\ldots,s$). Note that the action space $\mathcal{A}$ can be partitioned
into $k\wedge m$ components $\mathcal{A}_s^*$ ($s = 1,\ldots,k\wedge m$) where we should choose $s$
different populations for $m$ components. Again, $\mathcal{A}_s^*$ can be written as
$\mathcal{A}_s^* = \bigcup_{n_s\in N_s} \mathcal{A}_{n_s}$ where
$\mathcal{A}_{n_s} = \{(i_1,\ldots,i_s): (i_1,\ldots,i_s; n_1,\ldots,n_s) \in \mathcal{A}_s^*\}$.
The loss function given by (3.2.1) can be written as
$L(\lambda,a) = \sum_{j=1}^{s} n_j\lambda_{i_j}$ for $a \in \mathcal{A}_{n_s}$. Now consider a decision problem
with the action space $\mathcal{A}_{n_s}$, the above loss function and the observation
vector $x$. Clearly this problem is equivalent to partitioning $k$
populations $\pi_1,\ldots,\pi_k$ into $s+1$ subsets $(\gamma_1,\ldots,\gamma_s,\gamma_{s+1})$ where $\gamma_j$
is of size 1 for $j = 1,\ldots,s$, $\gamma_{s+1}$ is of size $k-s$ and the action
$(i_1,\ldots,i_s)$ corresponds to the action
$(\{\pi_{i_1}\},\ldots,\{\pi_{i_s}\},\{\pi_i: 1 \le i \le k, i \ne i_1,\ldots,i_s\})$. Note that this
decision problem is invariant under the permutation group, and that the
loss function satisfies the monotonicity and the invariance properties of
Eaton (1967) with $\lambda_i^{-1}$ being the parameter $\theta_i$ in Eaton's paper. Since
the density $f(x,\lambda_i)$ of $X_i$ has the monotone likelihood ratio property in
$x$ and $\theta_i = \lambda_i^{-1}$, it follows from Eaton's result that the rule which
assigns $\pi_{(k-j+1)}$ to $\gamma_j$ for $j = 1,\ldots,s$ and the remaining brands to
$\gamma_{s+1}$ is Bayes wrt any permutationally symmetric prior distribution of
$\lambda$. Hence, the result follows.

The  following  lemma  is  needed  for  the  main  result. 


Lemma 3.2.2. Assume that $X_1,\ldots,X_k$, given $\theta = (\theta_1,\ldots,\theta_k) \in \Theta^k$, are
independently distributed random variables with $X_i$ having pdf
$f(x,\theta_i)$. If $f(x,\theta)$ has the monotone likelihood ratio (MLR) property
in $x$ and $\theta$, and if the prior distribution, $\tau(\theta)$, of $\theta = (\theta_1,\ldots,\theta_k)$
is permutationally symmetric on $\Theta^k$, then, for $i \ge j$,

$$E[g(\theta_{(i)})\,|\,x] \ge E[g(\theta_{(j)})\,|\,x]$$

provided $g(\cdot)$ is non-decreasing on $\Theta$, where $\theta_{(i)}$ is the $\theta$ associated
with $x_{[i]}$.

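Proof. Let $\Omega_0 = \{\theta \in \Theta^k: \theta_{(i)} \ge \theta_{(j)}\}$, then

$$\int [g(\theta_{(i)})-g(\theta_{(j)})]\,f(x,\theta)\,d\tau(\theta) = \Big[\int_{\Omega_0} + \int_{\Omega_0^c}\Big][g(\theta_{(i)})-g(\theta_{(j)})]\,f(x,\theta)\,d\tau(\theta)$$

$$= \int_{\Omega_0} [g(\theta_{(i)})-g(\theta_{(j)})]\,[f(x,\theta)-f(x,\theta')]\,d\tau(\theta)$$

where $f(x,\theta) = \prod_{r=1}^{k} f(x_r,\theta_r)$ and $\theta'$ is obtained from $\theta$ by interchanging
$\theta_{(i)}$ and $\theta_{(j)}$, keeping other components fixed. Therefore,

$$E[g(\theta_{(i)})-g(\theta_{(j)})\,|\,x] = \eta(x)\int_{\Omega_0} [g(\theta_{(i)})-g(\theta_{(j)})]\,[f(x,\theta)-f(x,\theta')]\,d\tau(\theta)$$

where $\eta(x)$ is a normalizing factor. The result follows from the MLR
property of $f(x,\theta)$ and the fact that $g(\theta_{(i)})-g(\theta_{(j)}) \ge 0$ for
$\theta \in \Omega_0$ if $g$ is non-decreasing.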

Now  we  state  the  following  result. 


Theorem 3.2.1. For any permutationally symmetric prior distribution
of $\lambda$ on $(0,\infty)^k$, the Bayes rule $\delta^* = \delta_{n_1}$, i.e., $\delta^*$ draws all $m$
components from $\pi_{(k)}$.


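Proof. It follows from (3.2.2) that $r(\delta_{n_s},x)$ can be written as

$$r(\delta_{n_s},x) = E\Big[\sum_{j=1}^{s} n_j\lambda_{(k-j+1)} \,\Big|\, x\Big] \quad\text{for } n_s \in N_s.$$

Therefore

$$r(\delta_{n_s},x) - E\Big[(m-s+1)\lambda_{(k)} + \sum_{j=2}^{s} \lambda_{(k-j+1)} \,\Big|\, x\Big] = E\Big[\sum_{j=2}^{s} (n_j-1)(\lambda_{(k-j+1)}-\lambda_{(k)}) \,\Big|\, x\Big] \ge 0$$

by Lemma 3.2.2.

Thus $\min_{n_s\in N_s} r(\delta_{n_s},x) = r(\delta_s,x)$ where $\delta_s = \delta_{n_s^*}$ with
$n_s^* = (m-s+1,1,\ldots,1) \in N_s$, i.e., $\delta_s$ draws $(m-s+1)$ components from
$\pi_{(k)}$ and one component from each $\pi_{(k-j+1)}$ ($j = 2,\ldots,s$). Since, for
any $s$ with $2 \le s \le k\wedge m$,

$$r(\delta_s,x)-r(\delta_1,x) = E\Big[\sum_{j=2}^{s} (\lambda_{(k-j+1)}-\lambda_{(k)}) \,\Big|\, x\Big] \ge 0 \quad\text{by Lemma 3.2.2,}$$

the result follows from Lemma 3.2.1.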

The next result follows from considering a permutation
symmetric prior which gives mass $1/k!$ at each parameter point
obtained from a fixed parameter $\lambda \in (0,\infty)^k$ by permuting its components.

Corollary 3.2.1. The 'natural' rule $\delta^*$ is uniformly best among the
permutation invariant rules.

It follows from the above result that the natural rule $\delta^*$ is
admissible and minimax among all decision rules.
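
As a rough illustration of Theorem 3.2.1 and Corollary 3.2.1, the following Python sketch enumerates every action of a small series system and checks that, for known failure rates, the loss (3.2.1) is smallest when all components come from the brand with the smallest rate. It also codes the natural rule, which draws every component from the brand with the largest observed total lifelength. The function names and all numerical values are illustrative assumptions and do not come from the original text.

```python
from itertools import combinations_with_replacement


def series_loss(rates, action):
    """Loss (3.2.1): the sum of the failure rates of the chosen components."""
    return sum(rates[i] for i in action)


def expected_series_life(rates, action):
    """Expected lifelength of a series system of independent exponentials."""
    return 1.0 / series_loss(rates, action)


def natural_rule(totals, m):
    """Draw all m components from the brand with the largest total lifelength."""
    best = max(range(len(totals)), key=lambda i: totals[i])
    return (best,) * m


# Hypothetical example with k = 3 brands and m = 2 components (0-based labels).
rates = [0.5, 0.2, 0.9]        # assumed failure rates lambda_i
totals = [21.0, 48.5, 10.2]    # assumed sufficient statistics x_i
m = 2

actions = list(combinations_with_replacement(range(len(rates)), m))
best_action = min(actions, key=lambda a: series_loss(rates, a))
print(best_action, natural_rule(totals, m))
```

In this toy configuration the brand with the smallest failure rate also has the largest total lifelength, so the natural rule and the loss-minimizing action coincide.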

Remark 3.2.1. If we consider a loss function
$L_1(\lambda,(i_1,\ldots,i_m)) = (m \min_{1\le i\le k} \lambda_i)^{-1} - (\sum_{j=1}^{m} \lambda_{i_j})^{-1}$,
it can be easily shown that Lemma 3.2.1
holds for this loss function. Assuming an exchangeable prior for $\lambda$
on $(0,\infty)^k$, it can be verified that the Bayes rule $\delta^*$ for the loss
function $L_1$ satisfies $r(\delta^*,x) = \min_{1\le s\le k\wedge m} r(\delta_s,x)$ where the rule $\delta_s$ is
as in the proof of Theorem 3.2.1. Even though this is a considerable
reduction in the number of candidates for the Bayes rule, specification
of it seems difficult except when $m = 2$. Further simplification of
the Bayes rule with respect to a specific prior would be interesting
along with some numerical results.

(B) The 1-out-of-2 system

Here, the action space is $\mathcal{A} = \{(i,j): 1 \le i \le j \le k\}$ where
$(i,j)$ is the action of drawing one component each from $\pi_i$ and $\pi_j$,
respectively. For the 1-out-of-2 system the expected lifelength of
the system for the action $(i,j)$ is given by $\lambda_i^{-1}+\lambda_j^{-1}-(\lambda_i+\lambda_j)^{-1}$.

Again as in (A), the loss function is assumed to be given by

$$L(\lambda,(i,j)) = (\lambda_i^{-1}+\lambda_j^{-1}-(\lambda_i+\lambda_j)^{-1})^{-1}. \tag{3.2.4}$$

As mentioned earlier, Brostrom (1977) considered a scale invariant
loss function obtained by dividing (3.2.4) by $L(\lambda,(1,2))$, i.e., the
loss incurred by an 'intermediate' action which we don't have in the
case when $k > 2$.
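
The expected lifelength used in (3.2.4) is the mean of the larger of two independent exponential lifelengths. The short Python sketch below evaluates that formula and the resulting loss, and compares the formula with a seeded Monte Carlo estimate. The function names, the seed and the number of replications are assumptions made for illustration only.

```python
import random


def parallel_life(lam_i, lam_j):
    """Expected lifelength of a 1-out-of-2 system, i.e. E[max(X_i, X_j)]."""
    return 1.0 / lam_i + 1.0 / lam_j - 1.0 / (lam_i + lam_j)


def loss_324(lam_i, lam_j):
    """Loss (3.2.4): the reciprocal of the expected system lifelength."""
    return 1.0 / parallel_life(lam_i, lam_j)


def simulated_life(lam_i, lam_j, reps=200_000, seed=1):
    """Monte Carlo estimate of E[max(X_i, X_j)] for exponential lifelengths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += max(rng.expovariate(lam_i), rng.expovariate(lam_j))
    return total / reps


print(parallel_life(1.0, 2.0), simulated_life(1.0, 2.0))
print(loss_324(2.0, 2.0))  # both components from one brand: equals 2 * lambda / 3
```

When both components are drawn from the same brand the loss reduces to $2\lambda/3$, which is the quantity that enters the posterior risk of the rule drawing two components from one population.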

We will derive a Bayes rule with respect to a prior distribution
of $\lambda$. The prior distribution of $\lambda$ is assumed to be independent
natural conjugate gamma distribution with two parameters, i.e., the
joint a priori pdf of $\lambda$ is given by

$$\tau(\lambda) = \prod_{i=1}^{k} \Big[\frac{\beta^{\alpha}}{\Gamma(\alpha)}\,\lambda_i^{\alpha-1} e^{-\beta\lambda_i}\Big], \quad \alpha > 0 \text{ and } \beta > 0. \tag{3.2.5}$$

The improper prior corresponding to the vague prior knowledge can be
given by the above with $\alpha = \beta = 0$. It can be easily observed that,
given $X = x$, $\lambda_{(1)},\ldots,\lambda_{(k)}$ are, a posteriori, independently
distributed gamma random variables with mean $(n+\alpha)(x_{[i]}+\beta)^{-1}$ and
variance $(n+\alpha)(x_{[i]}+\beta)^{-2}$.

As in (A), the action space $\mathcal{A}$ can be partitioned into
$\mathcal{A}_1 = \{(i,i): i = 1,\ldots,k\}$ and $\mathcal{A}_2 = \{(i,j): 1 \le i < j \le k\}$. Then the
decision problem with $\mathcal{A}_s$ ($s = 1,2$) is equivalent to partitioning
$\pi_1,\ldots,\pi_k$ into two subsets $\gamma_1$ and $\gamma_2$, $\gamma_1$ being of size $s$ and
$\gamma_2$ being of size $k-s$. Then by the same arguments as in Lemma 3.2.1,
we have the next result.


Lemma 3.2.3. Assume that the prior of $\lambda$ on $(0,\infty)^k$ is permutationally
symmetric. Then the Bayes rule $\delta^*$ is given by

$$r(\delta^*,x) = \min\{r(\delta_1,x),\ r(\delta_2,x)\} \tag{3.2.6}$$

where $\delta_1$ chooses 2 components from $\pi_{(k)}$, and $\delta_2$ chooses one
component from $\pi_{(k)}$ and another from $\pi_{(k-1)}$.

Now  we  state  the  Bayes  rule. 

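Theorem 3.2.2. Assume that the prior is given by (3.2.5). Then the
Bayes rule $\delta^*$ is given by

$$\delta^* = \begin{cases}
\delta_1 & \text{if } x_{[k-1]}+\beta \le c\,(x_{[k]}+\beta) \\
\delta_2 & \text{if } x_{[k-1]}+\beta > c\,(x_{[k]}+\beta)
\end{cases} \tag{3.2.7}$$

where $c = H_{\alpha,n}^{-1}(0) \in (0,1)$,
$H_{\alpha,n}(c) = E\Big[\dfrac{UV(U+cV)}{U^2+cUV+c^2V^2}\Big] - \dfrac{2}{3}(n+\alpha)$ for $c > 0$
and $U$, $V$ are iid gamma random variables with mean $(n+\alpha)$ and variance
$(n+\alpha)$.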


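Proof. It follows from (3.2.4) and (3.2.5) that

$$r(\delta_1,x) = \frac{2}{3}\,E[\lambda_{(k)}\,|\,x] = \frac{2}{3}\,\frac{n+\alpha}{x_{[k]}+\beta}$$

and

$$r(\delta_2,x) = E\Big[\frac{\lambda_{(k)}\lambda_{(k-1)}(\lambda_{(k)}+\lambda_{(k-1)})}{\lambda_{(k)}^2+\lambda_{(k)}\lambda_{(k-1)}+\lambda_{(k-1)}^2} \,\Big|\, x\Big] = \frac{1}{x_{[k]}+\beta}\,E\Big[\frac{UV(U+r_kV)}{U^2+r_kUV+r_k^2V^2}\Big] \quad\text{for } r_k = \frac{x_{[k-1]}+\beta}{x_{[k]}+\beta}$$

where $U$ and $V$ are iid gamma variables with mean $(n+\alpha)$ and variance
$(n+\alpha)$. Since $H_{\alpha,n}(t)$ is non-increasing in $t > 0$, $r(\delta_1,x) \ge r(\delta_2,x)$
if and only if $H_{\alpha,n}(r_k) \le 0$, i.e., $r_k \ge H_{\alpha,n}^{-1}(0)$. Furthermore, it can
be easily observed that

$$H_{\alpha,n}(1) = E\Big[\frac{UV(U+V)}{U^2+UV+V^2}\Big] - \frac{2}{3}(n+\alpha) < 0$$

which implies $0 < H_{\alpha,n}^{-1}(0) < 1$. Hence the result follows from Lemma
3.2.3.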
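
The constant $c$ of Theorem 3.2.2 has no closed form, so a numerical sketch may help. The Python code below is only one possible way to approximate it and is not the Laguerre quadrature used for the tables of this chapter. It assumes $a = n+\alpha \ge 1$ and uses the standard fact that, for iid gamma variables $U$ and $V$ with shape $a$, the ratio $B = U/(U+V)$ has a Beta($a$,$a$) distribution and is independent of $U+V$. Because the integrand is homogeneous of degree one, the expectation in $H_{\alpha,n}$ becomes $2a$ times a one-dimensional integral, which is evaluated by Simpson's rule. The names `H`, `bayes_constant` and `bayes_rule` are hypothetical.

```python
import math


def H(t, a, steps=2000):
    """H_{alpha,n}(t) for a = n + alpha >= 1, using B = U/(U+V) ~ Beta(a, a)."""
    log_norm = math.lgamma(2 * a) - 2 * math.lgamma(a)

    def integrand(b):
        if b <= 0.0 or b >= 1.0:
            return 0.0  # the integrand vanishes at both endpoints when a >= 1
        ratio = b * (1 - b) * (b + t * (1 - b)) / (
            b * b + t * b * (1 - b) + (t * (1 - b)) ** 2)
        weight = math.exp(log_norm + (a - 1) * (math.log(b) + math.log1p(-b)))
        return ratio * weight

    h = 1.0 / steps
    total = sum((4 if i % 2 else 2) * integrand(i * h) for i in range(1, steps))
    # E[UV(U+tV)/(U^2+tUV+t^2V^2)] = E[U+V] * E[ratio(B)] = 2a * integral
    return 2 * a * (total * h / 3) - 2 * a / 3


def bayes_constant(a, tol=1e-10):
    """Root c of H(c, a) = 0 on (0, 1), found by bisection (H is non-increasing)."""
    lo, hi = 1e-6, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if H(mid, a) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bayes_rule(x, n, alpha, beta):
    """Rule (3.2.7): 0-based labels of the two brands supplying the components."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    top, second = order[-1], order[-2]
    c = bayes_constant(n + alpha)
    if x[second] + beta <= c * (x[top] + beta):
        return (top, top)       # delta_1: both components from the best brand
    return (top, second)        # delta_2: one from each of the two best brands


print(bayes_constant(5))
```

Taking `alpha = beta = 0` in `bayes_rule` corresponds to the vague prior case mentioned above.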


As it was pointed out, the Bayes rule wrt the vague prior
knowledge can be obtained from (3.2.7) by taking $\alpha = \beta = 0$. It can be
easily shown that $X_1(X_1+\beta)^{-1},\ldots,X_k(X_k+\beta)^{-1}$ are marginally independent
beta random variables with mean $n(n+\alpha)^{-1}$ when the prior is given by
(3.2.5). It follows that the overall risk of the rule $\delta_1$ is given by

$$r(\delta_1) = \frac{2}{3}\,(n+\alpha)\,\beta^{-1}\,E[Z_{[1]}] < \infty$$

where $Z_{[1]}$ is the smallest order statistic based on a sample of size
$k$ from beta distribution with mean $\alpha(n+\alpha)^{-1}$. Therefore the overall
risk of the Bayes rule $\delta^*$ is finite. Furthermore it can be shown
that the risk function $R(\lambda,\delta)$ is a continuous function of $\lambda$ for
non-randomized rule $\delta$. It follows that the Bayes rule $\delta^*$ is admissible
in the class of non-randomized decision rules.

Remark 3.2.2. If a loss function is given by
$L_2(\lambda,(i,j)) = \frac{3}{2}(\min_{1\le l\le k} \lambda_l)^{-1} - [\lambda_i^{-1}+\lambda_j^{-1}-(\lambda_i+\lambda_j)^{-1}]$,
it follows from similar methods
that the Bayes rule wrt the prior given by (3.2.5) is given by
(3.2.7)  with  c = g;!„  (0)6  (0.1)  and  G0-n(t)  - [ ^ ^ * 

E['£y+y]  for  t > 0 where  U and  V are  iid  gamma  random  variables  with 
mean  and  variance  equal  to  (n+a). 

At  the  end  of  this  chapter  tables  of  the  constants  c are  provided 
to  implement  the  Bayes  rules  given  in  Theorem  3.2.2  and  Remark  3.2.2. 
Values of $c$ are found by numerically integrating $H_{\alpha,n}(c)$ and $G_{\alpha,n}(c)$
using  Laguerre  polynomials.  In  doing  this  the  first  fifteen 
Laguerre  polynomials  were  used  (see  Abramowitz  and  Stegun  (1964)). 

3.3  Some  selection  procedures  for  symmetric  scale  parameter 
populations 

Let $\pi_1,\ldots,\pi_k$ denote $k$ independent populations with continuous
cumulative distribution functions $F_1(x) = F(x/\sigma_1),\ldots,F_k(x) = F(x/\sigma_k)$,
respectively, where $\sigma_i > 0$, $i = 1,2,\ldots,k$, and $F$ is symmetric
about the origin. Suppose we are interested in ranking or screening
these populations with regard to some measures of dispersion. One
measure of dispersion for a symmetric distribution $G$ is a non-negative
functional $\tau(G)$ or equivalently $\tau(Z)$ with $Z$ being a random variable
with distribution function $G$ (see Bickel and Lehmann (1976)) such
that

$\tau(aZ) = a\tau(Z)$ for $a > 0$,

$\tau(Z+b) = \tau(Z)$ for any real $b$ and (3.3.1)

$\tau(Z) \le \tau(Z')$ whenever $|Z'|$ is stochastically larger than $|Z|$.

It follows that the measure of dispersion of $F_i$ is $\tau(F_i) = \sigma_i\tau(F)$ for
$i = 1,\ldots,k$. Hence ranking the populations with regard to the measure
of dispersion becomes equivalent to the ranking in terms of $\sigma_i$'s. This
leads to a selection and ranking problem in terms of $\sigma_i$'s.

Several procedures for this problem have been proposed. When we know
the functional form of $F$, some suitable estimators of $\sigma_i$'s are usually
used for selection or ranking purposes. For example we might use sample
standard deviations for normal populations and sample mean deviations
for double exponential populations. When we do not assume the
functional form of $F$, we may use the estimators of some measures of
dispersion of $F_i$'s and study the robustness of the selection procedures.

This is the approach taken in this section.


(A) Selection of the $t$ 'best' populations - Indifference-zone approach

Let $X_{i1},\ldots,X_{in}$ ($i = 1,\ldots,k$) be the independent observations from
population $\pi_i$ with cdf $F_i(x) = F((x-\theta_i)/\sigma_i)$ where $F$ is continuous and
symmetric about the origin. Here $\sigma_1,\ldots,\sigma_k$ are unknown and $\theta_1,\ldots,\theta_k$ may
be either known or unknown. Let $\sigma_{[1]} \le \cdots \le \sigma_{[k]}$ be the ordered $\sigma_i$'s
where no a priori information about the correct pairing of $\pi_i$ and $\sigma_{[j]}$
is assumed. Our goal is to select the $t$ ($1 \le t \le k-1$) populations
associated with the $t$ smallest scale parameters $\sigma_{[1]},\ldots,\sigma_{[t]}$ based on the
independent observations $X_{i1},\ldots,X_{in}$, $i = 1,\ldots,k$.

The indifference-zone formulation, due to Bechhofer (1954), of
this problem may be briefly described as follows (3.3.2):

(i) Choose an appropriate statistic $T$ and observe
$T_i = T_i^{(n)} = T(X_{i1},\ldots,X_{in})$ for $i = 1,\ldots,k$.

(ii) Use the natural procedure $R_T$ which selects populations
associated with the $t$ smallest $T_i$ values.

(iii) For preassigned $P^* \in (1/\binom{k}{t},1)$ and $\Delta > 1$, determine the
smallest sample size $n = n(k,P^*,\Delta)$ such that
$\inf_{\sigma\in\Omega(\Delta)} P(CS\,|\,\sigma) \ge P^*$
where $\Omega(\Delta) = \{\sigma = (\sigma_1,\ldots,\sigma_k): \sigma_{[t+1]} \ge \Delta\sigma_{[t]}\}$ and CS stands for a
correct selection of the $t$ populations associated with
$\sigma_{[1]},\ldots,\sigma_{[t]}$.

Several optimum properties of the above procedure with a suitable
statistic $T$ have been proved under certain assumptions on $F$ (see, for example,
Bahadur and Goodman (1952), Lehmann (1966), Eaton (1967)).

First  we  will  consider  the  case  when  the  location  parameters  are 
known. 


Case (I): $\theta_1,\ldots,\theta_k$ known


In this case we may assume $\theta_1 = \cdots = \theta_k = 0$ without loss of
generality. Let us consider the following selection procedures based
on robust estimators studied by Bickel and Lehmann (1976); for
$1 \le p \le 2$ and $0 < \alpha < \frac{1}{2} \le \beta < 1$,

Procedure $R(p)$: Use $T(X_{i1},\ldots,X_{in}) = \big(\frac{1}{n}\sum_{j=1}^{n} |X_{ij}|^p\big)^{1/p}$ in (3.3.2).

Procedure $R(\alpha,\beta)$: Use $T(X_{i1},\ldots,X_{in}) = \big[\sum_{j=[n\alpha]+1}^{[n\beta]} X_{i[j]}^2/([n\beta]-[n\alpha])\big]^{1/2}$
in (3.3.2),

where $X_{i[1]}^2 \le \cdots \le X_{i[n]}^2$ denote the ordered $X_{i1}^2,\ldots,X_{in}^2$ and $[\cdot]$ is
the greatest integer function. Note that $R(2)$ is the procedure
proposed by Bechhofer and Sobel (1954) for the normal populations.
This procedure will be taken as the standard one for the comparison
of procedures.
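
The two statistics and the natural procedure $R_T$ can be sketched in a few lines of Python. This is only an illustrative reading of the definitions above for the known-location case (all $\theta_i$ taken as 0); the function names and the toy samples are assumptions, not part of the original text.

```python
import math


def power_dev(xs, p):
    """Statistic of R(p): ((1/n) * sum |x_j|^p)^(1/p), with location taken as 0."""
    return (sum(abs(x) ** p for x in xs) / len(xs)) ** (1.0 / p)


def trimmed_sd(xs, alpha, beta):
    """Statistic of R(alpha, beta): root mean of the ordered squares
    with ranks [n*alpha]+1, ..., [n*beta]."""
    n = len(xs)
    lo, hi = math.floor(n * alpha), math.floor(n * beta)
    squares = sorted(x * x for x in xs)
    return math.sqrt(sum(squares[lo:hi]) / (hi - lo))


def select_t_smallest(samples, t, stat):
    """Natural procedure R_T: 0-based labels of the t smallest statistic values."""
    values = [stat(xs) for xs in samples]
    ranked = sorted(range(len(values)), key=lambda i: values[i])
    return sorted(ranked[:t])


# Hypothetical samples from k = 3 populations, n = 4 observations each.
samples = [[1.0, -1.0, 2.0, -2.0],
           [10.0, -10.0, 20.0, -20.0],
           [0.1, -0.1, 0.2, -0.2]]
print(select_t_smallest(samples, 2, lambda xs: power_dev(xs, 1)))
print(select_t_smallest(samples, 2, lambda xs: trimmed_sd(xs, 0.25, 0.75)))
```

Both statistics satisfy $T(ax) = aT(x)$ for $a > 0$, which is the scale invariance property required in the next lemma.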

The  next  result  gives  the  least  favorable  configurations  for 
the  above  procedures. 

Lemma 3.3.1. Let $R_T$ denote the procedure defined by (3.3.2) where
the statistic $T$ is scale invariant, i.e., $T(aX_{i1},\ldots,aX_{in}) =
aT(X_{i1},\ldots,X_{in})$ for any $a > 0$. Then the following holds:

$$\inf_{\sigma\in\Omega(\Delta)} P(CS\,|\,\sigma) = P(CS\,|\,\sigma = (\underbrace{1,\ldots,1}_{t\text{ times}},\Delta,\ldots,\Delta)). \tag{3.3.3}$$

Proof. Using a theorem of Barr and Rizvi (1966) it follows that
$P(CS\,|\,\sigma) = P(\max_{1\le j\le t} T_{(j)} < \min_{t+1\le j\le k} T_{(j)}\,|\,\sigma)$, with $T_{(1)},\ldots,T_{(k)}$ being
associated with $\sigma_{[1]} \le \cdots \le \sigma_{[k]}$, is a non-decreasing function of
$\sigma_{[t+1]},\ldots,\sigma_{[k]}$ and non-increasing in $\sigma_{[1]},\ldots,\sigma_{[t]}$. Therefore
the result follows from the scale invariance of $T$.

It  follows  from  the  above  result  that  we  have  the  same  least 

favorable  configuration  as  long  as  the  statistic  T is  scale 

invariant.  However,  this  slippage  configuration  of  parameters  is 

not  the  least  favorable  for  the  procedures  based  on  ranks  (see 

Rizvi  and  Woodworth  (1970)).  Now  the  sample  size  n required  by  the 

basic  probability  condition  can  be  found  by 


$$P\Big(\max_{1\le j\le t} T_{(j)} < \min_{t+1\le j\le k} T_{(j)} \,\Big|\, \sigma\Big) = P^* \tag{3.3.4}$$

subject to the condition

$$\sigma = (1,\ldots,1,\Delta,\ldots,\Delta). \tag{3.3.5}$$

To solve this we need complete knowledge of $F$. But if we
do not wish to rely on the distributional assumption on $F$, we can
find a large sample solution.

Large  Sample  Solution  and  Asymptotic  Efficiency 

To find the large sample solution for $n$, following
Lehmann (1963), we use the device of replacing $\Delta$ by $\Delta_n$ and determine
the large sample solution of $n$ required to satisfy (3.3.4) subject
to the condition

$$\sigma_{[k]} = \cdots = \sigma_{[t+1]} = \Delta_n\sigma_{[t]} = \cdots = \Delta_n\sigma_{[1]}. \tag{3.3.6}$$

As Lehmann (1963) and Puri and Puri (1969) have pointed out,
considering $\Delta_n$ as a sequence depending on $n$ is only a mathematical
device to approximate the actual situation and in practice $\Delta_n$ will be
identified with the given value of $\Delta$.

Assume that $\sqrt{n}\,(T(Z_1,\ldots,Z_n)-\tau(F))$ is asymptotically normally
distributed with mean 0 and variance $v^2(F)$ when $Z_1,\ldots,Z_n$ are
iid random variables with cdf $F(z)$, where $\tau(F)$ is a measure of
dispersion of $F$ satisfying (3.3.1). Let $Y_i = T(Z_{i1},\ldots,Z_{in})$ for
$i = 1,\ldots,k$, where $Z_{11},\ldots,Z_{kn}$ are $kn$ iid random variables with cdf
$F(\cdot)$. Then from (3.3.6), we see that equation (3.3.4) is equivalent
to $P(Y_i \le \Delta_n Y_j,\ i = 1,\ldots,t,\ j = t+1,\ldots,k) = P^*$. This implies,
after taking the logarithmic transformation, that

$$\lim_{n\to\infty} P(U_i-U_j \le \sqrt{n}\,(\log\Delta_n)\,\tau(F)/v(F),\ i = 1,\ldots,t,\ j = t+1,\ldots,k) = P^*$$

where $U_1,\ldots,U_k$ are iid standard normal variables. This equation
will be satisfied if and only if

$$\Delta_n = 1 + \frac{\Delta^*}{\sqrt{n}}\,\frac{v(F)}{\tau(F)} + o(n^{-1/2})$$

where $\Delta^*$ is determined by

$$P^* = t\int_{-\infty}^{\infty} \Phi^{t-1}(x)\,\Phi^{k-t}(\Delta^*-x)\,d\Phi(x). \tag{3.3.7}$$

Thus we have proved the following result.

Lemma 3.3.2. Consider the procedure $R_T$ in (3.3.2) with $T$ being
scale invariant and satisfying the properties in the above paragraph.
Let $n$ be the solution of (3.3.4) subject to (3.3.6), then as $n \to \infty$

$$\Delta_n = 1 + \frac{\Delta^*}{\sqrt{n}}\,\frac{v(F)}{\tau(F)} + o(n^{-1/2}) \tag{3.3.8}$$

where $\Delta^*$ is determined by (3.3.7).

It follows from the above result that, for given $\Delta$ and $P^*$,
a large sample solution of $n$ is given by

$$n \approx \Big(\frac{\Delta^*}{\log\Delta}\Big)^2\,\frac{v^2(F)}{\tau^2(F)}. \tag{3.3.9}$$

The values of $\Delta^*$ satisfying (3.3.7) can be found in Bechhofer (1954).
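
When the tables of Bechhofer (1954) are not at hand, $\Delta^*$ can be approximated numerically. The following Python sketch evaluates the right-hand side of (3.3.7) by Simpson's rule and inverts it by bisection, then forms a large sample value of $n$. It assumes that the sample size approximation is taken in the form $n \approx (\Delta^* v(F)/(\tau(F)\log\Delta))^2$ suggested by the logarithmic transformation above; the function names, the integration range $[-8, 8]$ and the step count are illustrative choices.

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def pcs_limit(delta_star, k, t, lo=-8.0, hi=8.0, steps=1600):
    """Right-hand side of (3.3.7), approximated by Simpson's rule."""
    def integrand(x):
        return (t * _N.cdf(x) ** (t - 1)
                * _N.cdf(delta_star - x) ** (k - t) * _N.pdf(x))

    h = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3


def solve_delta_star(p_star, k, t, tol=1e-10):
    """Bisection for Delta* in (3.3.7); needs 1/C(k, t) < p_star < 1."""
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_limit(mid, k, t) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def large_sample_n(p_star, k, t, delta, v_over_tau):
    """Assumed large sample size: (Delta* * v/tau / log(delta))^2, rounded up."""
    d = solve_delta_star(p_star, k, t)
    return math.ceil((d * v_over_tau / math.log(delta)) ** 2)


# Hypothetical use: k = 2, t = 1, P* = 0.95, Delta = 1.5, normal populations
# with R(2), for which v^2/tau^2 = (1/4) * (E Z^4 / (E Z^2)^2 - 1) = 1/2.
print(solve_delta_star(0.95, 2, 1), large_sample_n(0.95, 2, 1, 1.5, math.sqrt(0.5)))
```

For $k = 2$ and $t = 1$ the integral in (3.3.7) reduces to $\Phi(\Delta^*/\sqrt{2})$, which gives a convenient check of the numerical routine.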

From the central limit theorem it follows that if $\int x^4\,dF(x) < \infty$,
then (3.3.9) holds for $R(p)$ ($1 \le p \le 2$) with $\tau(F) = (\int |x|^p\,dF(x))^{1/p}$
and $v^2(F) = p^{-2}\,(\int |x|^p\,dF(x))^{2/p-2}\,[\int |x|^{2p}\,dF(x) - (\int |x|^p\,dF(x))^2]$.
Therefore, the large sample solution for the procedure $R(p)$ is given by

$$n(p) = \Big(\frac{\Delta^*}{\log\Delta}\Big)^2 \Big[\frac{E|Z|^{2p}}{(E|Z|^p)^2} - 1\Big]\,p^{-2} \tag{3.3.10}$$

where the expectations are taken wrt the distribution $F(\cdot)$. Let
us consider $R(\alpha,\beta)$ and let
$T_n(\alpha,\beta) = \big[\sum_{j=[n\alpha]+1}^{[n\beta]} Z_{[j]}^2/([n\beta]-[n\alpha])\big]^{1/2}$
where $Z_{[1]}^2 \le \cdots \le Z_{[n]}^2$ are the ordered $Z_1^2,\ldots,Z_n^2$ with $Z_1,\ldots,Z_n$
being iid random variables with cdf $F(\cdot)$. Then the following
theorem (see, for example, Stigler (1973)) gives the sufficient
condition for the procedure $R(\alpha,\beta)$ to satisfy the assumptions in the
above lemma.


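Theorem 3.3.1. Let $G(\cdot)$ denote the cdf of $Z_1^2$. Assume
that $a = G^{-1}(\alpha)$ and $b = G^{-1}(\beta)$ are uniquely determined. Then
as $n \to \infty$ the limiting distribution of $\sqrt{n}\,(T_n^2(\alpha,\beta)-\tau^2(\alpha,\beta))$ is normal
with mean 0 and variance $v^2(\alpha,\beta)$ where
$\tau(\alpha,\beta) = \big[\frac{1}{\beta-\alpha}\int_a^b y\,dG(y)\big]^{1/2}$,

$$v^2(\alpha,\beta) = (\beta-\alpha)^{-2}\big[(\beta-\alpha)c + (b-\tau^2(\alpha,\beta))^2\beta(1-\beta) + (a-\tau^2(\alpha,\beta))^2\alpha(1-\alpha) - 2(b-\tau^2(\alpha,\beta))(a-\tau^2(\alpha,\beta))\alpha(1-\beta)\big]$$

and $c = \frac{1}{\beta-\alpha}\int_a^b y^2\,dG(y) - \tau(\alpha,\beta)^4$.

It follows that under the assumptions in the above theorem, the
large sample solution for the procedure $R(\alpha,\beta)$ is given by

$$n(\alpha,\beta) = \frac{1}{4}\Big(\frac{\Delta^*}{\log\Delta}\Big)^2\,\frac{v^2(\alpha,\beta)}{\tau^4(\alpha,\beta)}. \tag{3.3.11}$$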

Now comparison between the procedures is in order. To this end,
the procedure $R(2)$ is taken as the standard one and, following
Lehmann (1963), the asymptotic relative efficiency (ARE) of a
procedure wrt $R(2)$ is then defined to be the limiting ratio of
the reciprocals of the corresponding sample sizes required to achieve
the same minimum probability of a correct selection over $\Omega(\Delta)$.
These are given as follows:

$$\mathrm{ARE}(R(p),R(2);F) = \frac{p^2}{4}\Big[\frac{EZ^4}{(EZ^2)^2} - 1\Big] \Big/ \Big[\frac{E|Z|^{2p}}{(E|Z|^p)^2} - 1\Big] \tag{3.3.12}$$

$$\mathrm{ARE}(R(\alpha,\beta),R(2);F) = \Big[\frac{EZ^4}{(EZ^2)^2} - 1\Big]\,\frac{\tau^4(\alpha,\beta)}{v^2(\alpha,\beta)}$$

where the expectations are taken wrt the distribution $F(\cdot)$. Note
that the above ARE's are the same as those of the estimators in
Bickel and Lehmann (1976) where one can find several examples of

.2 , 


the  ARE's.  For  example,  ARE(R(p) ,R(2) ;F)  > 


2 ' 2 
IT  -p  for 


4/tT  r(p+  j) 

all  F which  is  the  cdf  of  a scale  mixture  of  normal  distributions 
with  a common  mean. 
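
A quick numerical check of the efficiency of $R(p)$ relative to $R(2)$ can be made from absolute moments alone. The Python sketch below assumes the ratio form $\frac{p^2}{4}[EZ^4/(EZ^2)^2 - 1]/[E|Z|^{2p}/(E|Z|^p)^2 - 1]$ implied by (3.3.10), and uses the well-known closed forms of $E|Z|^r$ for the standard normal and for the double exponential density $e^{-|z|}/2$. The function names are hypothetical.

```python
import math


def abs_moment_normal(r):
    """E|Z|^r for Z ~ N(0, 1)."""
    return 2 ** (r / 2) * math.gamma((r + 1) / 2) / math.sqrt(math.pi)


def abs_moment_laplace(r):
    """E|Z|^r for the double exponential density exp(-|z|) / 2."""
    return math.gamma(r + 1)


def are_power(p, moment):
    """ARE of R(p) wrt R(2), given a function r -> E|Z|^r under F."""
    kurt_term = moment(4) / moment(2) ** 2 - 1
    p_term = moment(2 * p) / moment(p) ** 2 - 1
    return p * p * kurt_term / (4 * p_term)


for p in (1.0, 1.5, 2.0):
    print(p, are_power(p, abs_moment_normal), are_power(p, abs_moment_laplace))
```

For $p = 1$ this gives $1/(\pi-2)$ under normality and $1.25$ under the double exponential distribution, in line with the remark that sample mean deviations are natural for double exponential populations.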


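Case (II): $\theta_1,\ldots,\theta_k$ unknown

In this case we use $T_i = T(X_{i1},\ldots,X_{in}) = \big(\frac{1}{n}\sum_{j=1}^{n} |X_{ij}-\bar{X}_i|^p\big)^{1/p}$ for
the procedure $R(p)$ and
$T_i = \big[\sum_{j=[n\alpha]+1}^{[n\beta]} (X_i-\tilde{X}_i)_{[j]}^2/([n\beta]-[n\alpha])\big]^{1/2}$ for the
procedure $R(\alpha,\beta)$, where $(X_i-\tilde{X}_i)_{[j]}^2$ is the j-th smallest of the
squared deviations $(X_{ij}-\tilde{X}_i)^2$, $\tilde{X}_i$ is the median of $X_{i1},\ldots,X_{in}$ and
$\bar{X}_i = \frac{1}{n}\sum_{j=1}^{n} X_{ij}$. Then it follows from Bickel and Lehmann (1976) that the
results analogous to the case (I) hold for $R(p)$ without any further
assumption, and hold for $R(\alpha,\beta)$ with the further assumption that $F$
is differentiable with positive and continuous derivative $f$.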

(B)  Selection  of  a subset  containing  the  'best'  population 
Here we consider the problem of selecting a subset of the $k$
populations which includes the population associated with the
smallest scale parameter with probability at least $P^*$. This problem
has been studied by Gupta and Sobel (1962) and Gupta (1965) when
a specific form of the cdf is assumed. Nonparametric or
robust selection procedures have also been proposed and studied by
Sobel (1967), Blumenthal and Patterson (1969), Wong (1976) and
McDonald (1977), among others.

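Let $X_{i1}, \ldots, X_{in}$ be independent observations from $\pi_i$
with cdf $F_i(x) = F\left(\frac{x-\theta_i}{\sigma_i}\right)$ for
$i = 1, \ldots, k$. Here, $F$ is assumed to be continuous and symmetric
about the origin, $\sigma_i$ is unknown and $\theta_i$ is known. We may
assume $\theta_1 = \cdots = \theta_k = 0$. Let
$\sigma_{[1]} \le \cdots \le \sigma_{[k]}$ denote the ordered $\sigma$'s
and let $\pi_{(i)}$ denote the population associated with
$\sigma_{[i]}$. Our goal is to select a subset of random size, depending
on the observed data, so that, for given $P^*$ ($1/k < P^* < 1$),

$$\inf_{\sigma \in \Omega} P(CS \mid \sigma) \ge P^* \tag{3.3.13}$$

where $\Omega = \{\sigma = (\sigma_1, \ldots, \sigma_k) : \sigma_i > 0,\ i = 1, \ldots, k\}$
and CS denotes the correct selection of a subset which includes
$\pi_{(1)}$.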

We will consider the procedure $R^*(p)$ ($1 \le p < 2$), which includes
$\pi_i$ in the selected subset if
$T_i \le c^{-1} \min_{1 \le j \le k} T_j$, where
$T_i = T(X_{i1}, \ldots, X_{in}) = \left(\frac{1}{n}\sum_{j=1}^{n} |X_{ij}|^p\right)^{1/p}$
and $c$ ($0 < c \le 1$) is determined to satisfy (3.3.13).
It can be easily shown that, for the procedure $R^*(p)$, the
infimum of $P(CS \mid \sigma)$ occurs when all the $\sigma_i$'s are
equal; therefore, the constant $c = c(n,k,p,P^*)$ is obtained from

$$\int_0^\infty [1 - G(cx)]^{k-1}\, dG(x) = P^*,$$

where $G$ is the cdf of $T_1$ when $\sigma_1 = 1$. For the same reason
as in (A), one can find a large sample solution for $c$ in order to
avoid the distributional assumption on $F$. As in (A), the
large sample solution for $c$ is given in the next result.
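
The selection rule of $R^*(p)$ described above can be sketched in a few lines. This is only an illustration under stated assumptions: the location parameters are taken to be zero (as in the text), the constant `c` is supplied by the caller rather than computed, and the function names `scale_statistic` and `select_subset` are hypothetical.

```python
def scale_statistic(sample, p):
    """T = ((1/n) * sum |x_j|^p)^(1/p) for one sample with known center 0."""
    n = len(sample)
    return (sum(abs(x) ** p for x in sample) / n) ** (1.0 / p)


def select_subset(samples, p, c):
    """Return the indices i with T_i <= min_j T_j / c (rule R*(p)).

    samples: list of k samples (lists of floats), one per population.
    c:       constant with 0 < c <= 1, assumed to be given.
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must satisfy 0 < c <= 1")
    stats = [scale_statistic(s, p) for s in samples]
    threshold = min(stats) / c
    return [i for i, t in enumerate(stats) if t <= threshold]


if __name__ == "__main__":
    data = [
        [1.0, -1.0, 1.0, -1.0],   # T = 1.0
        [3.0, -3.0, 3.0, -3.0],   # T = 3.0
        [1.2, -1.2, 1.2, -1.2],   # T is about 1.2
    ]
    print(select_subset(data, p=2.0, c=0.8))
```

With `c = 0.8` the threshold is 1.25 times the smallest statistic, so the first and third populations are retained; with `c = 1` only the population with the smallest statistic is retained. Smaller values of `c` give larger subsets.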


Lemma 3.3.3. Let $c$ be the constant satisfying the basic probability
condition. Then, as $n \to \infty$,

$$c = 1 - d\, n^{-1/2} p^{-1} \left[\frac{E|Z|^{2p}}{(E|Z|^p)^2} - 1\right]^{1/2} + o(n^{-1/2}) \tag{3.3.14}$$

where the expectations are taken with respect to the distribution
$F(\cdot)$ and $d$ is determined by

$$\int \Phi^{k-1}(x+d)\, d\Phi(x) = P^*. \tag{3.3.15}$$
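
When tabulated values of $d$ are not at hand, (3.3.15) can be solved numerically and the first-order approximation (3.3.14) evaluated directly. The sketch below assumes $\Phi$ is the standard normal cdf, uses composite Simpson integration on $[-8, 8]$ with bisection (a choice made here, not a method taken from the source), and drops the $o(n^{-1/2})$ remainder. All function names are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def pcs_integral(d, k, lo=-8.0, hi=8.0, steps=2000):
    """Composite Simpson approximation of int Phi^(k-1)(x + d) dPhi(x)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 == 1 else 2)
        total += weight * norm_cdf(x + d) ** (k - 1) * norm_pdf(x)
    return total * h / 3.0


def solve_d(k, p_star, tol=1e-10):
    """Bisection for d in (3.3.15); requires 1/k < p_star < 1."""
    if not 1.0 / k < p_star < 1.0:
        raise ValueError("need 1/k < P* < 1")
    lo, hi = 0.0, 10.0          # the integral is increasing in d
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_integral(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def large_sample_c(n, p, d, moment_ratio):
    """First-order value of c from (3.3.14).

    moment_ratio is E|Z|^(2p) / (E|Z|^p)^2 under F (supplied by the caller).
    """
    return 1.0 - d / (p * math.sqrt(n)) * math.sqrt(moment_ratio - 1.0)


if __name__ == "__main__":
    d = solve_d(k=2, p_star=0.90)
    # For normal F and p = 2 the moment ratio is EZ^4 / (EZ^2)^2 = 3.
    print(d, large_sample_c(n=50, p=2.0, d=d, moment_ratio=3.0))
```

For $k = 2$ the integral in (3.3.15) equals $\Phi(d/\sqrt{2})$, which gives a convenient check on the numerical solution.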


Values of $d$ satisfying (3.3.15) are available in Gupta (1956)
for various values of $k$ and $P^*$. Let $S_p$ and $S_p^*$ denote the
random size of the selected subset and the random number of non-best
populations in the selected subset, respectively, for the procedure
$R^*(p)$. It follows from the result in Gupta and Sobel (1962) that
$\sup E(S_p \mid \sigma) = kP^*$ provided $f(x) = F'(x)$ is log-concave.
Since small values of $S_p^*$ are desirable, we would like to keep
$E(S_p^* \mid \sigma)$ as small as possible. Let us consider the
following slippage configuration:

$$\Delta\sigma_{[1]} = \sigma_{[2]} = \cdots = \sigma_{[k]} \quad \text{for some } \Delta > 1. \tag{3.3.16}$$

Then, for a given $\epsilon > 0$, we would like the sample size $n$ to
be small, where $n$ is determined, subject to (3.3.16), by

$$E(S_p^* \mid \sigma) = \epsilon. \tag{3.3.17}$$
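
The expected number of non-best populations retained under the slippage configuration can be approximated by simulation. The following sketch is only illustrative: it assumes $F$ is normal, takes $\sigma_{[1]} = 1$ without loss of generality, and treats `c` as given; the function name `mean_nonbest_selected` and its parameters are hypothetical.

```python
import random


def scale_statistic(sample, p):
    """T = ((1/n) * sum |x_j|^p)^(1/p) for one sample with known center 0."""
    return (sum(abs(x) ** p for x in sample) / len(sample)) ** (1.0 / p)


def mean_nonbest_selected(k, n, p, c, delta, reps=2000, seed=0):
    """Monte Carlo estimate of the expected number of non-best populations
    retained by R*(p) when sigma_[2] = ... = sigma_[k] = delta * sigma_[1].

    Assumes normal observations with mean 0; population 0 is the best.
    """
    rng = random.Random(seed)
    sigmas = [1.0] + [delta] * (k - 1)
    total = 0
    for _ in range(reps):
        stats = [
            scale_statistic([rng.gauss(0.0, s) for _ in range(n)], p)
            for s in sigmas
        ]
        threshold = min(stats) / c
        # Count retained populations other than the best one (index 0).
        total += sum(1 for t in stats[1:] if t <= threshold)
    return total / reps


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, mean_nonbest_selected(k=4, n=n, p=2.0, c=0.7, delta=1.5))
```

Fixing the seed makes runs reproducible. In a sketch like this one would increase `n` until the estimate falls to the target $\epsilon$.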




Following McDonald (1969), we define the asymptotic relative
efficiency $\mathrm{ARE}(p,2;\Delta,F)$ of $R^*(p)$ relative to $R^*(2)$
to be the limiting ratio of $n(2,\epsilon)$ to $n(p,\epsilon)$ as
$\epsilon \to 0$. To find the large sample solution of
(3.3.17), we use the device of replacing $\Delta$ by $\Delta_n$.

Lemma 3.3.4. Assume (3.3.14) and (3.3.16). Then, as $n \to \infty$,

$$\Delta_n = 1 + \Delta^* n^{-1/2} p^{-1} \left[\frac{E|Z|^{2p}}{(E|Z|^p)^2} - 1\right]^{1/2} + o(n^{-1/2})$$


(3.3. 


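where the expectations are taken with respect to $F(\cdot)$ and
$\Delta^*$ is determined by

$$(k-1)\int \Phi^{k-2}(d-x)\,\Phi(d-\Delta^*-x)\, d\Phi(x) = \epsilon$$

for $d$ given by (3.3.15).

Hence the large sample solution of (3.3.17), keeping terms of
order $n^{-1/2}$, is given by

$$n(p,\epsilon) = \left(\frac{\Delta^*}{\Delta - 1}\right)^2 p^{-2} \left[\frac{E|Z|^{2p}}{(E|Z|^p)^2} - 1\right].$$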
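
The two steps above (solving for $\Delta^*$, then for $n$) can be sketched numerically. This is a rough illustration under stated assumptions: $\Phi$ is the standard normal cdf, $d$ is supplied by the caller, the integral is approximated by composite Simpson integration on $[-8, 8]$, and the sample size is obtained by solving the first-order relation of Lemma 3.3.4 for $n$ with the remainder term dropped. The function names are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def expected_nonbest(delta_star, d, k, lo=-8.0, hi=8.0, steps=2000):
    """(k-1) * int Phi^(k-2)(d-x) Phi(d - delta_star - x) dPhi(x), by Simpson."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 == 1 else 2)
        total += (weight * norm_cdf(d - x) ** (k - 2)
                  * norm_cdf(d - delta_star - x) * norm_pdf(x))
    return (k - 1) * total * h / 3.0


def solve_delta_star(d, k, eps, tol=1e-10):
    """Bisection for delta_star >= 0; the left side decreases in delta_star."""
    lo, hi = 0.0, 20.0
    if not expected_nonbest(hi, d, k) < eps < expected_nonbest(lo, d, k):
        raise ValueError("eps is outside the attainable range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_nonbest(mid, d, k) > eps:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def large_sample_n(delta, delta_star, p, moment_ratio):
    """First-order n from Delta - 1 = delta_star * n^(-1/2) * p^(-1) * sqrt(ratio - 1).

    moment_ratio is E|Z|^(2p) / (E|Z|^p)^2 under F (supplied by the caller).
    """
    return (delta_star / (delta - 1.0)) ** 2 * (moment_ratio - 1.0) / p ** 2


if __name__ == "__main__":
    d = math.sqrt(2.0) * 1.2815515655446004   # k = 2, P* = 0.90
    ds = solve_delta_star(d, k=2, eps=0.5)
    print(ds, large_sample_n(delta=1.5, delta_star=ds, p=2.0, moment_ratio=3.0))
```

For $k = 2$ the left side reduces to $\Phi((d - \Delta^*)/\sqrt{2})$, so $\epsilon = 0.5$ gives $\Delta^* = d$, which serves as a check. In practice the resulting $n$ would be rounded up to an integer.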
It follows that the ARE of $R^*(p)$ relative to $R^*(2)$ is given by

$$\mathrm{ARE}(p,2;\Delta,F) = \frac{p^2}{4}\left[\frac{EZ^4}{(EZ^2)^2} - 1\right] \Big/ \left[\frac{E|Z|^{2p}}{(E|Z|^p)^2} - 1\right],$$

which is the same as that in (A).

Table VI

c-values for implementing the Bayes rule in Theorem 3.2.2, which
depends on n and a through the quantity m = n + a.

    m       c         m       c         m       c
   1.0    .3033     11.0    .8870     21.0    .9388
   1.5    .4392     11.5    .8916     21.5    .9402
   2.0    .5342     12.0    .8958     22.0    .9415
   2.5    .6021     12.5    .8997     22.5    .9428
   3.0    .6530     13.0    .9034     23.0    .9440
   3.5    .6925     13.5    .9067     23.5    .9451
   4.0    .7240     14.0    .9099     24.0    .9462
   4.5    .7496     14.5    .9128     24.5    .9473
   5.0    .7710     15.0    .9156     25.0    .9483
   5.5    .7890     15.5    .9182     25.5    .9493
   6.0    .8044     16.0    .9206     26.0    .9502
   6.5    .8177     16.5    .9229     26.5    .9511
   7.0    .8293     17.0    .9251     27.0    .9520
   7.5    .8396     17.5    .9271     27.5    .9529
   8.0    .8486     18.0    .9291     28.0    .9537
   8.5    .8567     18.5    .9309     28.5    .9545
   9.0    .8640     19.0    .9326     29.0    .9552
   9.5    .8706     19.5    .9343     29.5    .9560
  10.0    .8766     20.0    .9359     30.0    .9567
  10.5    .8820     20.5    .9374     30.5    .9574
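
When m = n + a falls between two tabulated values, one simple option is to interpolate between the neighbouring entries. The sketch below uses only the first five rows of Table VI and assumes that linear interpolation is adequate on the 0.5 grid; the source does not prescribe an interpolation method, and the names `TABLE_VI_EXCERPT` and `lookup_c` are hypothetical.

```python
from bisect import bisect_right

# First five (m, c) pairs of Table VI; extend with further rows as needed.
TABLE_VI_EXCERPT = [
    (1.0, 0.3033),
    (1.5, 0.4392),
    (2.0, 0.5342),
    (2.5, 0.6021),
    (3.0, 0.6530),
]


def lookup_c(m, table=TABLE_VI_EXCERPT):
    """Linearly interpolate c at m from a table sorted by m (an assumption)."""
    ms = [row[0] for row in table]
    if not ms[0] <= m <= ms[-1]:
        raise ValueError("m is outside the tabulated range")
    i = bisect_right(ms, m) - 1
    if i == len(ms) - 1:          # m equals the last tabulated value
        return table[i][1]
    (m0, c0), (m1, c1) = table[i], table[i + 1]
    w = (m - m0) / (m1 - m0)
    return c0 + w * (c1 - c0)


if __name__ == "__main__":
    print(lookup_c(2.0), lookup_c(1.25))
```

Tabulated values of m are returned unchanged, so the lookup agrees with the table wherever the table applies.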

Table VII

c-values for implementing the Bayes rule in Remark 3.2.2, which
depends on n and a through the quantity m = n + a.

    m       c         m       c         m       c
   1.5    .7854     11.5    .9817     21.5    .9904
   2.0    .8592     12.0    .9825     22.0    .9907
   2.5    .8957     12.5    .9832     22.5    .9909
   3.0    .9173     13.0    .9839     23.0    .9911
   3.5    .9315     13.5    .9845     23.5    .9913
   4.0    .9415     14.0    .9851     24.0    .9915
   4.5    .9490     14.5    .9856     24.5    .9916
   5.0    .9548     15.0    .9861     25.0    .9918
   5.5    .9594     15.5    .9866     25.5    .9920
   6.0    .9631     16.0    .9870     26.0    .9921
   6.5    .9662     16.5    .9874     26.5    .9923
   7.0    .9688     17.0    .9878     27.0    .9924
   7.5    .9711     17.5    .9882     27.5    .9926
   8.0    .9730     18.0    .9885     28.0    .9927
   8.5    .9747     18.5    .9888     28.5    .9928
   9.0    .9762     19.0    .9891     29.0    .9930
   9.5    .9776     19.5    .9894     29.5    .9931
  10.0    .9788     20.0    .9897     30.0    .9932
  10.5    .9798     20.5    .9900     30.5    .9933
  11.0    .9808     21.0    .9902     31.0    .9934